Mechanical procedures for the manipulation of formal proofs have played a central role
in  proof  theory  for  more  than  fifty  years.  However,  such  procedures  have  not  been  widely 
applied  to  computational  problems.  One  reason  for  this  is  that  work  in  computer  science  to 
do  with  formal  proof  systems  has  emphasized  the  use  of  formal  proofs  as  evidence  -  as  tools 
for  automatically  establishing  the  truth  of  propositions.  As  a  consequence  of  this  emphasis, 
the  problem  of  mechanizing  the  construction  of  proofs  has  received  much  attention,  whereas 
the  manipulation  of  proofs  -  that  is,  the  conversion  of  one  form  of  evidence  into  another  -  has 
not. 


However,  formal  proofs  can  serve  purposes  other  than  the  presentation  of  evidence.  In 
particular,  a  formal  proof  of  a  proposition  having  the  form,  "for  each  x  there  is  a  y  such  that 
the  relation  R  holds  between  x  and  y"  provides,  under  the  right  conditions,  a  method  for 
computing  values  of  y  from  values  of  x.  That  is,  such  a  proof  describes  an  algorithm  A  where 
A  satisfies  the  specification  R  in  the  sense  that  for  each  x,  R(x,A(x))  holds.  Thus  formal  proof 
systems  can  serve  as  programming  languages  -  languages  for  the  formal  description  of 
algorithms.  A  proof  which  describes  an  algorithm  may  be  "executed"  by  use  of  any  of  a 
variety  of  procedures  developed  in  proof  theory. 

A  proof  differs  from  more  conventional  descriptions  of  the  same  algorithm  in  that  it 
formalizes  additional  information  about  the  algorithm  beyond  that  formalized  in  the 
conventional  description.  This  information  expands  the  class  of  transformations  on  the 
algorithm which are amenable to automation. For example, there is a class of "pruning"
transformations  which  improve  the  computational  efficiency  of  a  natural  deduction  proof 
regarded  as  a  program  by  removing  unneeded  case  analyses.  These  transformations  make 
essential  use  of  dependency  information  which  finds  formal  expression  in  a  proof,  but  not  in  a 
conventional  program.  Pruning  is  particularly  useful  for  removing  redundancies  which  arise 
when  a  general  purpose  algorithm  is  adapted  to  a  special  situation  by  symbolic  execution. 

This  thesis  concerns  (1)  computational  uses  of  the  additional  information  contained  in 
proofs,  and  (2)  efficient  methods  for  the  representation  and  transformation  of  proofs.  An 
extended  lambda-calculus  is  presented  which  allows  compact  expression  of  the  computationally 
significant  part  of  the  information  contained  in  proofs.  Terms  of  the  calculus  preserve 
dependency  data,  but  can  be  efficiently  executed  by  an  interpreter  of  the  kind  used  for 
lambda-calculus  based  languages  such  as  LISP.  The  calculus  has  been  implemented  on  the 
Stanford Artificial Intelligence Laboratory PDP-10 computer. Results of experiments on the
use  of  pruning  transformations  in  the  specialization  of  a  bin-packing  algorithm  are  reported. 

Chapter 1


Introduction 


The most obvious purpose of a proof is to convince - to provide compelling evidence for
the  truth  of  a  proposition.  A  formal  proof  provides  evidence  of  a  kind  that  can  be 
mechanically  recognized,  and  it  is  in  the  capacity  of  evidence  that  formal  proofs  have  most 
often  been  used  in  computation,  as  for  example  in  automatic  theorem  proving  and  in 
automatic  program  verification  of  the  usual  kind. 

As  a  consequence  of  the  emphasis  on  the  use  of  proofs  as  evidence,  only  two  of  the 
various  operations  which  people  commonly  perform  on  informal  proofs  have  played  a 
significant role in computations involving formal proofs. These operations are the
construction  of  proofs,  and  the  checking  or  recognition  of  proofs.  Operations  which  involve 
the  actual  manipulation  of  existing  proofs,  as  opposed  to  the  manipulation  of  formulas,  are 
not  much  used. 

However,  mechanical  procedures  for  proof  manipulation  have  played  a  central  role  in  the 
subfield  of  mathematical  logic  known  as  proof  theory  for  more  than  fifty  years.  This  thesis 
concerns  applications  of  proof  theoretic  methods  to  computational  problems.  In  particular, 
our  subject  matter  is  the  use  of  formal  proofs  for  the  description  of  algorithms,  and  the 
transformations  on  algorithms  which  are  made  possible  by  this  mode  of  description.  Thus  the 
work  differs  from  most  work  in  computer  science  to  do  with  formal  proofs  both  in  the  use  to 
which proofs are put, and in the emphasis placed on the manipulation - in contrast to the
construction  -  of  proofs. 

The  manner  in  which  proofs  may  be  used  to  express  algorithms  is  as  follows.  Suppose 
that  one  has  a  proof  that  an  object  with  given  properties  exists.  Then  the  proof  can 
sometimes  be  used  to  discover  the  identity  of  a  particular  object  with  those  properties.  If 
restrictions are made on the forms of inference used, then it is possible to guarantee that the
proof will (in one sense or another) provide this additional information. For example, a
constructive proof of ∃xφ(x) always "provides" a value v with φ(v) in the sense of indicating a
method for computing v; the computation may or may not be feasible in practice. However,
the restriction to constructivity is too strong. For one thing, a proof of ∃xφ(x) may exhibit a
value v which satisfies φ, but show that φ(v) holds by non-constructive methods. Also, if one
restricts the complexity of φ (for example, if φ is a quantifier free formula of first order
arithmetic), then any classical proof of ∃xφ(x) will provide a realization in the same sense and
by the same formal methods as a constructive proof. (By a "realization" of an existential
statement ∃xφ(x) is meant simply a value which satisfies the predicate φ.)

If  an  existence  proof  is  given  in  a  formal  way  -  in  a  way  which  makes  it  suitable  for 
mechanical  manipulation  -  then  one  might  hope  to  mechanize  the  passage  from  the  proof  to 
the  value  realizing  the  existential  statement.  Work  in  proof  theory  has  shown  that  the 
extraction of realizations from proofs can in fact be mechanized for a variety of formal
systems  and  in  a  variety  of  ways.  For  example,  Prawitz’s  normalization  procedure  may  be 
used  to  transform  a  natural  deduction  proof  of  an  existential  formula  into  a  direct  proof  of  the 
same  formula  which  will  -  under  rather  general  conditions  -  explicitly  mention  a  realization. 

Now, if one has a proof of a formula of the form ∀x∃yφ(x,y), the methods from proof
theory mentioned just above can evidently be used to compute a function f with ∀xφ(x,f(x)).
To do this, simply apply the general result ∀x∃yφ(x,y) to the input value, and then use
normalization (or whatever method one has in hand) to extract a realization. Thus a proof of
a formula ∀x∃yφ(x,y) serves the role of a program which computes a function satisfying the
"specification" φ.

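The reading of a proof of ∀x∃yφ(x,y) as a program can be sketched in a conventional
language. The Python fragment below is only an illustration under stated assumptions: the
specification `phi` (integer square root) and the function `proof` are hypothetical names
chosen for this example, and `proof` merely stands in for an induction proof; it is not
obtained from a formal proof by normalization.

```python
def phi(x: int, y: int) -> bool:
    """Hypothetical specification: y is the integer square root of x."""
    return y * y <= x < (y + 1) * (y + 1)


def proof(x: int) -> int:
    """Stand-in for a proof of 'for each x there is a y with phi(x, y)'.

    The structure mirrors an induction on x: the base case supplies the
    witness 0, and the induction step turns a witness for x - 1 into a
    witness for x by a single case analysis.
    """
    if x == 0:
        return 0                    # base case: phi(0, 0) holds
    y = proof(x - 1)                # induction hypothesis: phi(x - 1, y)
    if (y + 1) * (y + 1) <= x:      # case analysis in the induction step
        return y + 1
    return y
```

For every input x in the range tried, the returned value satisfies `phi(x, proof(x))`, which
is the sense in which such a proof meets its specification.
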
Given  that  proofs  can  be  used  as  programs,  what  is  the  interest  of  this  fact  for  computer 
science  and  for  practical  computing?  One  answer  is  as  follows. 

Existing programming languages are for the most part designed with economy of
expression  in  mind;  a  program  in  such  a  language  formalizes  exactly  the  information  needed 
for  carrying  out  the  task  at  hand.  A  proof,  on  the  other  hand,  formalizes  a  great  deal  of 
information  which  is  not  essential  for  the  simple  execution  of  a  computation  -  such  as  a 
description  of  the  task  being  performed,  a  verification  of  the  method,  and  an  account  of  the 
dependencies  between  facts  involved  in  the  computation.  The  additional  information 
contained  in  proofs  is  useful  in  the  transformation  of  computing  methods  -  for  example  in 
adapting  methods  to  new  situations.  This  should  not  be  surprising,  since  one  expects  that  the 
data  relevant  to  the  transformation  of  algorithms  will  be  different  and  more  extensive  than  the 
data  needed  for  simple  execution. 

We  shall  be  concerned  with  a  particular  set  of  transformations  on  algorithms  -  called  the 
"pruning  transformations".  These  transformations  remove  redundant  chunks  of  computation 
by making use of a kind of dependency information which does not appear in ordinary
programs. For the most part, the redundancies removed by pruning are not to be found in
proofs generated by people. Thus the pruning transformations will not be of much use when
applied to algorithms as originally presented. However, proofs which result from automatic
processes tend to include such redundancies.

For example, suppose that one has an algorithm A(x) which is to be used in a situation
where it is known in advance that all inputs will have a special form given by the term
t(y₁, ..., yₙ). Then A may be automatically adapted to perform efficiently in this special
situation by symbolically executing the code for A on the term t, and then applying
optimizing transformations to the result. (Ershov [1977] and Sandewall [Beckman, Haraldsson,
Oskarsson, and Sandewall, 1976], among others, have studied this method of specialization as
it applies to ordinary programs.) If A is expressed by a proof Π, then the result of
symbolically executing Π on the term t will often contain redundancies of the kind removed
by pruning even if Π as originally given contained no such redundancies. Thus, the
effectiveness of automatic specialization can be increased by adding pruning to the arsenal of
optimizations used in the course of specialization.
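
The idea behind pruning can be sketched on a toy term language. The Python fragment
below is an illustration only, not the system described in this thesis: the classes `Leaf`
and `Case`, the assumption labels, and the functions `deps` and `prune` are hypothetical.
The rule shown (a case analysis is replaced by a branch that never uses its own case
assumption) is one simple transformation in the spirit of pruning, assumed here for the
sake of the example. The sketch also ignores whatever the proof of the disjunction
itself depends on.

```python
from dataclasses import dataclass
from typing import FrozenSet, Union


@dataclass(frozen=True)
class Leaf:
    """A finished sub-proof; `uses` names the assumptions it depends on."""
    value: str
    uses: FrozenSet[str] = frozenset()


@dataclass(frozen=True)
class Case:
    """A case analysis: each branch may use (and discharges) its own label."""
    left_label: str
    left: "Term"
    right_label: str
    right: "Term"


Term = Union[Leaf, Case]


def deps(t: Term) -> FrozenSet[str]:
    """Open assumptions of t; a Case discharges the label of each branch."""
    if isinstance(t, Leaf):
        return t.uses
    return (deps(t.left) - {t.left_label}) | (deps(t.right) - {t.right_label})


def prune(t: Term) -> Term:
    """Replace a case analysis by a branch that ignores its case assumption."""
    if isinstance(t, Leaf):
        return t
    left, right = prune(t.left), prune(t.right)
    if t.left_label not in deps(left):
        return left
    if t.right_label not in deps(right):
        return right
    return Case(t.left_label, left, t.right_label, right)


# Hypothetical result of symbolic execution: the inner branch "b" uses no
# case assumption at all, so both enclosing case analyses are redundant.
specialized = Case(
    "h1", Leaf("a", frozenset({"h1"})),
    "h2", Case("h3", Leaf("b"), "h4", Leaf("c", frozenset({"h2", "h4"}))),
)
pruned = prune(specialized)
```

Here `pruned` is the single leaf `Leaf("b")`. The decision rests entirely on the recorded
dependency sets, which is the kind of information a conventional program does not carry.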

As they stand, the standard methods of proof theory are not adequate for carrying out the
specialization of algorithms in a feasibly efficient way. However, we have devised methods for
the execution and pruning of proofs which overcome this problem. The methods have been
implemented in a proof manipulation system running on the Stanford Artificial Intelligence
Laboratory PDP-10 computer. As a preliminary empirical investigation of the usefulness of
pruning in the specialization of algorithms, experiments on the specialization of a bin-packing
algorithm have been carried out.

The following topics are treated in this thesis, listed in order of decreasing generality.

(1) the use of proofs for the formalization of algorithms,

(2) optimizing transformations on proofs, in particular, the pruning transformations,

(3) efficient implementation of operations on proofs,

(4) the use of pruning in the specialization of algorithms, and

(5)  the  specialization  of  a  bin-packing  algorithm. 

The  general  objective  of  the  work  is  the  development  of  an  improved  technology  for  the 
manipulation  of  algorithms.  The  use  of  enriched  formal  descriptions  of  algorithms  - 
specifically,  formal  proofs  -  is  a  means  to  this  end. 

The contents of the thesis are as follows. Chapter 2 serves to introduce some material
from proof theory which will be needed in the course of the thesis. In particular, we define
the notion of a natural deduction proof system, and explain Prawitz's normalization procedure.
Also, we present a very simple example of the use of pruning in specializing algorithms. The
example is intended to illustrate the central features of the pruning transformations in a
setting of minimal technical complexity. Chapter 3 describes the methods which we have
devised for the efficient execution and pruning of proofs. In chapter 4, results of the
bin-packing experiments are reported. Chapter 5 sketches additional uses which might be made
of the proof technology described in chapter 3. There are two appendices, each of which is
intended primarily for readers with an interest in traditional proof theory. The first concerns
the relationship between our methods and the functional and realizability interpretations of
Kleene, Gödel, and Kreisel. The second appendix presents an example which demonstrates
that the features of proof systems which are of interest for traditional proof theory are
different from those which are most directly relevant to the computational use of proofs.

The remainder of this chapter is devoted to a collection of general remarks about the
work, and to previews of matters which are discussed in detail later on.

°  Manipulation  vs.  construction 

It  should  be  emphasized  again  that  the  work  described  in  this  paper  concerns  the 
automatic  manipulation  of  existing  proofs,  and  not  the  automatic  construction  of  new  proofs. 
The  bin  packing  proof  used  in  the  experiments  was  devised  "by  hand",  and  was  entered  by 
hand into the proof checking component of the proof manipulation system. If one is able to
automate, fully or partially, the construction of proofs which describe computational methods,
then so much the better. But such matters lie outside the scope of this thesis.

°  Differences  between  proofs  used  to  describe  computation  and  proofs  used  as  evidence 

It  is  necessary  to  keep  computational  considerations  explicitly  in  mind  when  constructing 
proofs which are intended as descriptions of computation. The best proof of a formula
∀x∃yφ(x,y) according to such standard criteria as brevity, elegance or comprehensibility, will
often embody a very bad algorithm. Conversely, a proof of ∀x∃yφ(x,y) which formalizes a
good algorithm will generally constitute a rather unnatural way of establishing the simple truth
of the formula. For the purposes of this thesis, proofs are to be regarded as a means of
formulating  algorithmic  ideas.  In  writing  a  proof  to  be  used  for  solving  a  computational 
problem,  one  follows  the  same  procedure  as  is  used  in  writing  an  ordinary  program.  Namely, 
one  first  devises  a  reasonable  algorithm,  and  afterwards  formalizes  that  algorithm  (as  a  proof). 
If  a  proof  is  given  in  complete  detail,  then  it  includes  a  justification  for  the  correctness  of  the 
algorithm  which  it  formalizes.  As  an  immediate  consequence,  formalization  of  algorithms  by 
proofs  provides  a  means  for  the  mechanical  verification  of  algorithms. 

However,  if  one  wishes  only  to  implement  an  algorithm,  and  not  to  verify  it,  then  the 
proof describing the algorithm need not be fully formalized. In particular, proofs of so-called
"Harrop formulas" can be left out. The Harrop formulas include for example all formulas
which lack occurrences of the positive logical symbols ∨ and ∃. The proof of any Harrop
formula may be omitted, and the formula taken as an axiom, without destroying the
computational usefulness of a proof in which that axiom appears.
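
The sufficient condition just mentioned is easy to check mechanically. The following
Python sketch assumes a hypothetical representation of formulas as nested tuples tagged
"atom", "false", "and", "or", "implies", "exists", and "forall". It tests only for the
absence of ∨ and ∃, and so recognizes a subclass of the Harrop formulas rather than the
whole class.

```python
def lacks_or_and_exists(formula: tuple) -> bool:
    """Return True if the formula contains neither "or" nor "exists"."""
    tag = formula[0]
    if tag in ("or", "exists"):
        return False
    if tag in ("atom", "false"):
        return True
    if tag in ("and", "implies"):
        # ("and", P, Q) or ("implies", P, Q)
        return lacks_or_and_exists(formula[1]) and lacks_or_and_exists(formula[2])
    if tag == "forall":
        # ("forall", variables, body)
        return lacks_or_and_exists(formula[2])
    raise ValueError(f"unknown formula tag: {tag!r}")


# forall x. (P(x) implies Q(x)): no "or" and no "exists", so it is accepted.
harrop_example = ("forall", ("x",), ("implies", ("atom", "P(x)"), ("atom", "Q(x)")))

# forall x. exists y. R(x, y): contains "exists", so it is rejected.
spec_example = ("forall", ("x",), ("exists", "y", ("atom", "R(x,y)")))
```

A formula accepted by this check is one whose proof could be replaced by an axiom in the
manner described above; a specification such as `spec_example` is rejected.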

Such "non-computational" formulas do not even need to be true. A proof which uses
incorrect Harrop formulas as axioms can be executed and pruned in the same manner as a
proof which is valid throughout. However, the function computed by the incorrect proof may
not satisfy the specification embodied in its end-formula.

A  formal  proof  which  is  constructed  for  the  purpose  of  describing  an  algorithm  and 
which  makes  free  use  of  Harrop  formulas  as  axioms  will  in  general  contain  only  a  part  of  the 
information needed to establish the truth of its end-formula. Thus the formal proofs which
will concern us here are not proofs in the ordinary sense at all; they do not supply - and are
not intended to supply - the evidence necessary to verify a proposition. We are bending the
machinery of formal proofs to a different end than that for which it was originally intended,
and  so  can  discard  the  part  of  that  machinery  which  is  irrelevant  to  our  new  purposes. 

°  The  role  of  constructive  methods 

We restrict our attention in this thesis to proofs which are built up using constructively
valid inferences. The particular formal proof system used is the natural deduction formulation
of first order logic as originally developed by Gentzen [1969] and later studied by
Prawitz [1965]. To arrive at the constructive (or "intuitionistic") variant of natural deduction
from the standard or classical natural deduction system for first order logic, one simply
removes one of the inference rules, namely the rule which expresses the principle of the
excluded middle.

Note for the reader who is unfamiliar with intuitionistic logic: The approach to the
foundations of mathematics which is known as "intuitionism" or "constructivism" was
originated  by  Brouwer.  According  to  this  approach,  the  subject  matter  of  mathematics  is  not 
an  external  world  of  mathematical  objects,  but  rather  the  world  of  mental  constructions  carried 
out  by  mathematicians.  This  point  of  view  leads  to  a  reinterpretation  of  the  meanings  of  the 
logical  symbols,  and  to  restrictions  on  the  modes  of  inference  which  can  be  employed. 

Heyting and later Gentzen developed formal systems for representing constructive reasoning.
It is not our intention here to give an exposition of intuitionism as a philosophical standpoint;
the interested reader is referred to van Dalen [1973].

We have chosen to use the constructive instead of the standard system not because of
any distrust of classical reasoning, nor because non-constructive proofs cannot be used to
describe algorithms. Indeed, the proofs which we use to describe algorithms will in any case
make use of complicated axioms (as explained in the last section), and there is no reason
whatever to require that these axioms be constructively valid. Thus the formulas which
appear  in  our  proofs  will  not  in  general  be  constructively  valid;  it  is  only  the  inference  rules 
used  for  manipulating  those  formulas  which  must  be  constructive.  But  further,  even  proofs 
which make essential use of non-constructive inferences in connection with non-Harrop
formulas can be executed by methods similar to those used for constructive proofs. In
particular, many of the methods of proof theory, including Prawitz's normalization method,
apply to classical proofs as well as to constructive proofs, and under certain conditions are
guaranteed to provide the same kind of information. For example, normalization may be
used to execute any (classical) proof of a formula ∀x∃yφ(x,y) of arithmetic whose matrix φ is
quantifier free; a value for y will always be supplied by normalization when any input value
for x is given. Thus the distinction between a proof which describes an algorithm and a proof
which does not is quite different from the distinction between a constructive and a
non-constructive proof.

However,  the  process  of  fleshing  out  an  algorithm  into  a  proof  from  (possibly  complex) 
Harrop axioms appears to lead naturally to a proof in which only constructive inferences are
used. This at least is our experience so far. So for the moment, there is no need to look at
classical systems, and by the restriction to constructive systems we are able to avoid a certain
amount  of  technical  complication. 

°  The  p-calculus 

Traditional  proof  theory  provides  two  kinds  of  methods  for  the  execution  of  proofs. 
First, there are the normalization and cut-elimination methods which carry out the
computation indicated by a proof by transformation of the proof itself. Second, there are the
functional  and  realizability  interpretations  which  extract  "code"  of  one  kind  or  another  from 
proofs;  it  is  then  the  code  which  is  executed,  and  not  the  proof  itself. 

Each of these two approaches is inadequate for the purposes which we have in mind here.
The normalization methods are unsatisfactory because they are too slow. On the other hand,
the methods which involve extraction of code from proofs retain only the information which is
needed for the computation immediately at hand; the additional data needed for the pruning
transformations is lost. This would not be a problem if we only intended to apply pruning
transformations to proofs as they are originally given. However, the use of proofs for the
specialization  of  algorithms  requires  that  the  additional  data  be  preserved  by  symbolic 
execution. 

Our solution to these difficulties involves the use of an extended λ-calculus, which we
shall refer to as the p-calculus. The p-calculus is designed to provide expression for just that
information contained in natural deduction proofs which is needed for execution and for the
pruning operations. P-calculus terms can be extracted from ordinary natural deduction proofs
in a straightforward manner, and executed efficiently by an interpreter of the kind used for
λ-calculus based languages such as LISP and SCHEME. Chapter 3 describes the p-calculus in
detail.

°  Related work in computer science

The  work  described  in  this  thesis  is  related  in  a  general  way  to  work  in  a  variety  of  areas 
of computer science. In particular, there are clear connections to code optimization, program
synthesis and transformation, and to dependency directed reasoning in the sense of [London
1978] and [Stallman & Sussman 1977]. The relation between the current work and the topics
just  mentioned  is  discussed  in  chapter  5.  In  what  follows,  we  give  a  brief  catalog  of  work 
within  computer  science  which  is  directly  concerned  with  the  extraction  of  information  from 
proofs. 

Green [1969] considered the problem of extracting information from resolution proofs.
Bishop [1970], Constable [1971], and Martin-Löf [1979] - among others - have suggested using
constructive proof systems as programming languages. Goto [1979] has implemented Gödel's
Dialectica interpretation for intuitionistic first-order arithmetic. Takasu [1978] discusses
computational uses of proofs in the same system by use of Gentzen's [1969] cut-elimination
procedure. Miglioli and Ornaghi [1980] describe a method for executing sequent calculus
proofs which differs from cut-elimination. In [Manna and Waldinger, 1979], a method for
automatic synthesis of programs is described which involves the simultaneous construction of a
natural deduction proof of the goal formula and of a program which realizes that formula (in
a suitable sense). Bates [1979] develops a constructive "refinement logic", and shows how
programs can be extracted from proofs of this logic. A Prolog program [Kowalski 1974] is a
collection of axioms in Horn clause form. An execution of a Prolog program consists of a
search for a proof in a restricted resolution system. The output is a term extracted from the
proof. In practice, the output term is constructed during the search for the proof. (See
chapter 5 for further comments concerning the work of Bates and of Kowalski.)

It  should  be  emphasized  that  the  aims  of  the  work  described  just  above  differ 
fundamentally  from  the  aims  of  the  work  presented  in  this  thesis.  In  the  former,  formal 
proofs  serve  as  vessels  from  which  computational  contents  of  a  standard  kind  are  extracted. 
In  contrast,  our  concern  is  to  exploit  the  differences  between  proofs  and  conventional 
descriptions  of  computation.  Specifically,  we  will  show  how  new  operations  on  algorithms  can 
be  mechanized  by  making  use  of  the  additional  information  to  be  found  in  proofs. 

Chapter 2


Normalization  and  Pruning  of  Natural  Deduction  Proofs 


In this chapter, we describe the natural deduction formalism (section 2.1), and the
normalization and pruning operations (sections 2.2, 2.7). A very simple example of the use of
pruning in specializing algorithms is given in section 2.8. Our presentation of natural
deduction and of normalization follows standard lines (e.g. Prawitz [1965]), except in the
treatment  of  "lemmas"  (section  2.5).  Certain  formal  details  concerning  normalization  are  left 
out,  and  all  results  are  stated  without  proof.  Also,  no  treatment  of  principles  of  induction  is 
given  until  Chapter  3,  where  normalization  and  pruning  are  described  in  formal  detail  as  they 
apply  to  a  computationally  efficient  representation  of  natural  deduction  proofs. 

2.1  Natural  deduction 

Systems of natural deduction were originally developed by Gentzen [1969]. The notation
used here is that of Prawitz [1965]. The reader is referred to Prawitz [1965] for a more
discursive  presentation  of  natural  deduction  and  of  a  normalization  procedure  for  natural 
deduction  proofs. 

In what follows, we describe the natural deduction formalism for intuitionistic first order
logic. The formalism is defined with a first order language L as a parameter; the class of
formulas which may appear in a proof is given by L. It should be noted that natural
deduction differs from other proof systems for intuitionistic first order logic in the kind of
structure which it provides for representing proofs, and not, for example, in the set of
theorems which it proves. It is possible to translate back and forth between proofs of natural
deduction and proofs of, say, the sequent calculus in a mechanical way. The advantages of
natural deduction are the advantages of a good data structure - a data structure which
represents human reasoning in a comparatively direct way, and to which the various
operations in which we are interested can be easily applied.

The notion of a first order language is defined in the standard manner, as follows. We
start with (1) an (infinite) list of variable symbols v_1, v_2, ..., (2) a list of constant symbols
c_1, c_2, ..., (3) a list of relation symbols R_1, R_2, ..., and (4) a list of function symbols
f_1, f_2, .... The arities of the relation symbols and function symbols are to be specified as
part of the definition of L. Terms of L are built up from variable and constant symbols by
means of function application in the standard manner. The set of formulas of L is defined
by the following inductive clauses. (1) The propositional constant FALSE is a formula.
(2) If t_1, ..., t_n are terms, and R is a relation symbol of arity n, then R(t_1, ..., t_n) is a
formula. (3) If P, Q are formulas, and x is a variable, then (a) P ∧ Q, (b) P ∨ Q, (c) P ⊃ Q,
(d) ∃xP are formulas. It is convenient for our purposes to allow universal quantification to
apply to a vector of variables; thus we have (e) ∀x_1 ... x_n P is a formula for any formula P
and vector of distinct variables x_1, ..., x_n. (∀x_1 ... x_n P is not an abbreviation for
∀x_1 ∀x_2 ... ∀x_n P.) We shall sometimes use underlined characters to refer to vectors - for
example x will refer to a vector of variables, and t to a vector of terms. We regard negation
as a defined notion; specifically, ¬P is to be read as an abbreviation for the formula
P ⊃ FALSE. The notion of a free occurrence of a variable in a formula is defined in the
standard manner.
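
The inductive clauses above translate directly into a recursive data type. The following is a minimal sketch in Python and is not part of the formalism itself: the class names (Var, Fn, Rel, Falsum, And, Or, Implies, Exists, Forall) and the helpers free_vars and neg are hypothetical names chosen for illustration. Constants are represented as function symbols with no arguments, and Forall binds a tuple of variables, as in clause (e).

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class Fn:
    """Function application; a constant symbol is Fn(name, ())."""
    name: str
    args: Tuple["Term", ...] = ()


Term = Union[Var, Fn]


@dataclass(frozen=True)
class Falsum:
    """The propositional constant FALSE."""


@dataclass(frozen=True)
class Rel:
    name: str
    args: Tuple[Term, ...]


@dataclass(frozen=True)
class And:
    left: "Formula"
    right: "Formula"


@dataclass(frozen=True)
class Or:
    left: "Formula"
    right: "Formula"


@dataclass(frozen=True)
class Implies:
    left: "Formula"
    right: "Formula"


@dataclass(frozen=True)
class Exists:
    var: str
    body: "Formula"


@dataclass(frozen=True)
class Forall:
    vars: Tuple[str, ...]  # a vector of distinct variables
    body: "Formula"


Formula = Union[Falsum, Rel, And, Or, Implies, Exists, Forall]


def term_vars(t):
    """Variables occurring in a term."""
    if isinstance(t, Var):
        return frozenset({t.name})
    return frozenset().union(*(term_vars(a) for a in t.args))


def free_vars(f):
    """Variables occurring free in a formula."""
    if isinstance(f, Falsum):
        return frozenset()
    if isinstance(f, Rel):
        return frozenset().union(*(term_vars(a) for a in f.args))
    if isinstance(f, (And, Or, Implies)):
        return free_vars(f.left) | free_vars(f.right)
    if isinstance(f, Exists):
        return free_vars(f.body) - {f.var}
    if isinstance(f, Forall):
        return free_vars(f.body) - frozenset(f.vars)
    raise TypeError(f"not a formula: {f!r}")


def neg(p):
    """Negation as a defined notion: not-P abbreviates P implies FALSE."""
    return Implies(p, Falsum())
```

Because negation is only an abbreviation, no separate constructor is needed for it; neg(P) is an ordinary implication and is handled by the Implies case everywhere.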

A natural deduction proof takes the form of a tree whose nodes are labeled by formulas,
by the names of inference rules, and by other information. This tree represents the history of
a logical argument - in particular it records a series of applications of inference rules which
lead from the hypotheses of the argument (represented by leaf nodes of the tree) to its
conclusion (represented by the root).

The leaves of a natural deduction proof tree are of two kinds: axioms and assumptions.
The truth of the conclusion of a natural deduction proof will in general depend on the truth
of the formulas which appear as axiom leaves, but may not depend on the truth of all of the
formulas which appear as assumption leaves. The reason for this is that the inference rules
of natural deduction can have the effect of "discharging assumptions". For example, consider
the implication introduction rule:

     (A)
      B
    -----
    A ⊃ B

This rule specifies that A ⊃ B can be inferred from B. In addition, the rule indicates that
the set of assumptions upon which A ⊃ B depends is to be computed by removing the formula
A from the set of assumptions on which B depends. (The appearance of A in parentheses is
what specifies that the assumption A is to be discharged.) Informally, the rule states that if B
can be proved using the assumption A, then A ⊃ B can be concluded, and this conclusion does
not depend on A being true. Thus the inference rules of natural deduction operate not just
on end-formulas of the subproofs to which they are applied, but on additional information
contained in those subproofs, namely, sets of assumptions.

In general, the formula attached to any node in a natural deduction proof tree depends
on some (possibly empty) subcollection of the formulas attached to assumption leaves of the
subtree rooted at that node. The members of this subcollection are referred to as the "open
assumptions" of the node. The inference rules specify what conclusions may be drawn from
premises of a given form, and in addition indicate how the open assumptions of the
conclusion are computed from the open assumptions of the premises.

The set of open assumptions of each node in a proof tree is computed recursively as
follows. First of all, the set of open assumptions of a leaf node is the empty set if the node
is an axiom, and the singleton set containing the node itself if the node is an assumption. The
set of open assumptions of any non-leaf node can be computed from the open assumptions of its
sons simply by applying the inference rule associated with that node.

Note that we use the phrase "open assumptions" to refer to a set of nodes on a proof
tree, and not to the set of formulas attached to those nodes.

Each of the inference rules of natural deduction has the following form:

    (A_1)  (A_2)  ...  (A_n)
     P_1    P_2   ...   P_n
    ------------------------
               C

In the above, some (or all) of the P_i may lack associated appearances of parenthesized
formulas (A_i). The meaning of such a rule is that a conclusion of form C can be derived
from premise formulas of forms P_1, ..., P_n. The set of open assumptions of the conclusion is
computed as follows. Let S_i be the set of open assumptions of premise P_i. For each i,
remove from S_i all nodes whose attached formula is A_i, and call the result S_i'. (If there is no
A_i associated with P_i, then let S_i' = S_i.) The set of open assumptions of the conclusion is
just the union of the S_i'. The A_i are called the assumptions discharged by the rule.
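
The recursive computation of open assumptions can be sketched as follows. This is an illustrative Python fragment under simplifying assumptions, not the representation used in the thesis: formulas are plain strings, and the names Node and open_assumptions are hypothetical. Each non-leaf node records, for every premise P_i, the formula A_i it discharges (or None). As in the text, the result is a set of nodes rather than a set of formulas, so nodes are compared by identity.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)  # eq=False: nodes are hashed and compared by identity
class Node:
    formula: str
    kind: str  # "axiom", "assumption", or the name of an inference rule
    premises: List["Node"] = field(default_factory=list)
    # discharged[i] is the formula A_i discharged on premise i, or None
    discharged: List[Optional[str]] = field(default_factory=list)


def open_assumptions(node):
    """Return the set of assumption leaves on which `node` depends."""
    if node.kind == "axiom":
        return frozenset()
    if node.kind == "assumption":
        return frozenset({node})
    result = frozenset()
    for premise, a_i in zip(node.premises, node.discharged):
        s_i = open_assumptions(premise)
        if a_i is not None:
            # S_i' = S_i minus all nodes whose attached formula is A_i
            s_i = frozenset(n for n in s_i if n.formula != a_i)
        result |= s_i
    return result


# Example: B is derived from the assumption A and an axiom A -> B,
# and implication introduction then discharges A.
a = Node("A", "assumption")
ax = Node("A -> B", "axiom")
b = Node("B", "imp_elim", [a, ax], [None, None])
imp = Node("A -> B", "imp_intro", [b], ["A"])
```

In the example, the node b depends on the assumption leaf a, while the conclusion imp depends on no assumption at all.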

Each of the inference rules of natural deduction is devoted to the treatment of a
particular logical symbol or quantifier. Conversely, for each logical symbol and quantifier,
there is a rule (or pair of rules) which has the effect of introducing that symbol, and another
rule (or pair of rules) which has the effect of eliminating that symbol. The rules are
designated by the symbol which they treat, and by their function, whether it be introduction
or elimination. For example, the two rules which treat implication are referred to as the
"⊃-introduction rule" and the "⊃-elimination rule" ("⊃I" and "⊃E" for short).

The inference rules of natural deduction are given below. We use the following notation
for substitution: A[x←t] denotes the result of replacing all occurrences of the variable
x by the term t in the formula A. If x is a vector of variables and t a vector of terms of the
same length, then A[x←t] denotes the result of substituting the terms t for the variables x in
parallel. As usual, substitution may require that bound variables be renamed.

∧-introduction:

    A    B
    ------
    A ∧ B

∧-elimination:

    A ∧ B        A ∧ B
    -----        -----
      A            B

∨-introduction:

      A            B
    -----        -----
    A ∨ B        A ∨ B

∨-elimination:

             (A)    (B)
    A ∨ B     C      C
    -------------------
             C

⊃-introduction:

     (A)
      B
    -----
    A ⊃ B

⊃-elimination:

    A    A ⊃ B
    ----------
        B

∀-introduction:

      A
    -----
    ∀xA       condition: none of the variables x may appear free in any
              assumption on which the premise A depends.

∀-elimination:

      ∀xA
    -------
    A[x←t]    where t is any vector of terms of L

∃-introduction:

    A[x←t]
    ------
     ∃xA      where t is any term of L

∃-elimination:

           (A)
    ∃xA     C
    ---------
        C

conditions: the variable x may not appear free in C, nor in
any assumption on which the second premise C depends
other than the assumption A.

The above rules are essentially Prawitz's rules for the intuitionistic predicate calculus.
However, we have left out the FALSE-elimination rule:

FALSE-elimination:

    FALSE
    -----
      A

The effect of this rule can be obtained by the use of axioms of the form FALSE ⊃ A for
atomic formulas A. (Any formula can be derived from FALSE by means of such axioms and
the use of introduction rules. For example, A ∨ B with A atomic may be derived from
FALSE by using the axiom FALSE ⊃ A, and then applying ∨-introduction.) As will be seen
(section 2.3), we shall allow such "false-elimination" axioms to appear in proofs used for
computation; in fact, the restriction that the consequent A be atomic may be weakened - A
may be any "Harrop" formula (section 2.3).

The classical first order predicate calculus is arrived at by adding the following inference
rule expressing the principle of the excluded middle (recall that ¬A abbreviates A ⊃ FALSE).

¬-elimination:

    (¬A)
    FALSE
    -----
      A

Notice that free variables which appear in axioms are in effect universally quantified; the
same conclusions can be drawn from an axiom A(x_1, ..., x_n) in which the x_i appear free as
from the axiom ∀x_1 x_2 ... x_n A(x_1, ..., x_n).

The ∀-introduction and ∃-elimination inferences bind variables in a proof in the same
sense that the quantifiers ∀ and ∃ bind variables in a formula. Specifically, the variables x in
the above presentation of the ∀-introduction rule are to be regarded as bound wherever they
occur in the proof of the premise of the rule. Similarly, the variable x in the ∃-elimination
rule is to be regarded as bound in the proof of the rule's second premise. In both formulas
and proofs, a bound variable serves as a local name which is meaningful only inside the scope
of the binding; such bound variables may be renamed at will without changing the meaning
of a formula or proof (as long as conflicts with other variable names are avoided). A precise
definition of the notion of a bound variable in a proof will be given in chapter 3.

By a "closed proof" we mean a proof in which no variables occur free, and in which the
end-formula depends on no assumption. Formulas which are not closed may appear in a
closed proof, as long as the free variables in those formulas are bound by one of the inference
rules ∀-introduction and ∃-elimination.

The following is a simple example of a natural deduction proof. The proof makes use of
no axioms. Assumption leaves of the proof tree appear in brackets. The reader can verify
that each of the assumptions is discharged in the course of the proof. The result is an
assumption free derivation of the predicate calculus theorem
∀y(P(y)∨Q(y)) ⊃ ∀x(Q(x)∨P(x)).

    [∀y(P(y)∨Q(y))]           [P(x)]                [Q(x)]
 ∀E ---------------      ∨I -----------        ∨I -----------
       P(x)∨Q(x)            Q(x)∨P(x)             Q(x)∨P(x)
 ∨E ---------------------------------------------------------
                            Q(x)∨P(x)
                      ∀I -----------------
                           ∀x(Q(x)∨P(x))
              ⊃I ---------------------------------
                   ∀y(P(y)∨Q(y)) ⊃ ∀x(Q(x)∨P(x))

2.2  Normalization 


In the course of this thesis we will have occasion to consider several procedures for the
step-by-step reduction of objects to a "normal" form. These "normalization" procedures share
certain general features. This section introduces the basic notions and terminology which
apply to normalization in each of its various forms.

The two standard normalization procedures which are most directly relevant to our
current purposes are the proof normalization procedure of Prawitz, and the normalization
procedure for Church's [1941] λ-calculus. The methods described in chapter 3 make essential
use of the close connection between these two procedures.

Let T be a class of terms (of whatever kind). A normalization procedure for T is (partly)
given by a collection R of "small" transformations, called "reduction rules". The
normalization of a term t consists of repeated application of the reduction rules until no
further application of a rule is possible. The result of this process (if it terminates) is called a
"normal form of t", and is designated by |t|.

More precisely, given a term t and a reduction rule r, r may or may not be applicable to t.
If r is applicable to t, it may be applicable in various ways (in the case of proofs and λ-terms,
the reduction rule may be applicable at several places within the proof or term). The result of
applying a reduction rule in a particular way to a term t is a modified term t'. A term to
which no reduction rule is applicable is said to be in normal form. A pair <T,R> where T is a
set of terms and R a set of reduction rules on those terms will be referred to as a "reduction
system".

We use the notation t_1 → t_2 to signify that t_2 results from an application of one of the
reduction rules to t_1. Any procedure for selecting a particular order (and "way") in which
reductions are to be applied to a term is called a "normalization procedure". Thus a
normalization procedure, when applied to any particular term t, generates a (possibly infinite)
sequence of terms t_0, t_1, t_2, ... where t_{i+1} is arrived at from t_i by the application of one of the
reduction rules. A theorem which states that a given normalization procedure always yields a
finite sequence of terms t_0, t_1, ..., t_n, where t_n is in normal form, regardless of the initial
term t_0, is referred to as a "normalization theorem". Other standard terminology concerning
normalization is as follows.

°  A system <T,R> has the "termination" property if every sequence of reductions t_1, t_2, ... is
finite.

°  We use the notation t →* t' to signify that t' results from t by some finite sequence
t → t_1 → t_2 → ... → t' of applications of reduction rules. A system <T,R> has the "uniqueness
property" if every sequence of reductions of a term to a normal form yields the same result.
That is, <T,R> has the uniqueness property if, whenever t →* t_1 and t →* t_2, and t_1 and t_2
are both normal, then t_1 = t_2.

°  A system which has both the termination and the uniqueness properties is said to have
the "strong normalization" property. Evidently, if a system has the strong normalization
property, then the normal form |t| of each term exists and is unique.
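
The notions of reduction system, normalization procedure, termination and uniqueness can be made concrete with a small sketch. The Python below is illustrative only: terms are strings, a reduction rule is modeled as a function returning every result of applying it to a term (one per "way" of applying it), and the names rewrite, one_step, normalize and normal_forms are hypothetical. The procedure normalize always picks the first applicable reduction, which is just one of many possible normalization procedures.

```python
def rewrite(lhs, rhs):
    """A reduction rule replacing one occurrence of lhs by rhs, at any position."""
    def rule(s):
        return [s[:i] + rhs + s[i + len(lhs):]
                for i in range(len(s)) if s.startswith(lhs, i)]
    return rule


def one_step(t, rules):
    """All terms t' such that t -> t'."""
    return [t2 for rule in rules for t2 in rule(t)]


def normalize(t, rules, max_steps=10_000):
    """Repeatedly apply the first applicable reduction until none applies."""
    for _ in range(max_steps):
        successors = one_step(t, rules)
        if not successors:
            return t  # no rule is applicable: t is in normal form
        t = successors[0]
    raise RuntimeError("no normal form reached within max_steps")


def normal_forms(t, rules):
    """All normal forms reachable from t (only for terminating systems)."""
    successors = one_step(t, rules)
    if not successors:
        return {t}
    result = set()
    for s in successors:
        result |= normal_forms(s, rules)
    return result


# A terminating system with the uniqueness property.
collapse = [rewrite("aa", "a")]

# A terminating system without the uniqueness property.
choose = [rewrite("ab", "a"), rewrite("ab", "b")]
```

Under the rule set collapse every reduction sequence from "aaaa" ends in the same normal form, whereas under choose the term "ab" has two distinct normal forms, so that system terminates but lacks the uniqueness property.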

Each of the computation procedures to be considered in the course of this thesis takes the
form of a normalization procedure of one kind or another. Of course, normalization
procedures need not be implemented in a literal minded way. Normalization for λ-calculus
based languages can be sped up by using environments instead of literal substitution for
carrying out λ-conversions. The implemented p-calculus interpreter on which the
experiments were carried out makes use of this idea.


2.3  Computing  using  proof  normalization 

This  section  concerns  the  manner  in  which  proof  normalization  may  be  used  for 
computing,  and  not  the  internal  workings  of  the  normalization  procedure  itself. 

The  usefulness  of  proof  normalization  for  computational  purposes  derives  from  the  special 
properties  possessed  by  proofs  which  are  in  normal  form.  Roughly  speaking,  the  reductions 
used  in  proof  normalization  have  the  effect  of  removing  certain  kinds  of  indirect  arguments 
from  a  proof.  A  normal  proof  contains  none  of  these  indirect  forms  of  argument,  and 
computationally  useful  information  can  be  read  off  a  proof  which  is  direct  in  this  sense. 

Evidently, some restriction must be made on the axioms which appear in a proof if it is
to be of any computational use. The appropriate restriction for our purposes is that all axioms
be "Harrop formulas". The Harrop formulas are those which do not contain the positive
logical symbols ∨ and ∃ except in the hypotheses of implications. More formally, the class of
Harrop formulas is defined by the following inductive clauses: (a) atomic formulas are Harrop
formulas, (b) if A and B are Harrop formulas, then so are A ∧ B, ∀xA, (c) if B is a Harrop
formula, then so is A ⊃ B, regardless of the form of A. A proof which contains only Harrop
formulas as axioms will be referred to as a Harrop proof.
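
The inductive definition of the Harrop formulas amounts to a simple recursive test. The following Python sketch is illustrative only: formulas are encoded as nested tuples with hypothetical tags ("atom", "false", "and", "or", "imp", "forall", "exists"), and the function name is_harrop is likewise hypothetical. Treating the constant FALSE like an atomic formula is an assumption made here; the text's clauses mention only atomic formulas.

```python
def is_harrop(f):
    """Check the inductive clauses (a)-(c) for Harrop formulas."""
    tag = f[0]
    if tag in ("atom", "false"):
        return True                      # clause (a); FALSE treated as atomic (assumption)
    if tag == "and":
        return is_harrop(f[1]) and is_harrop(f[2])   # clause (b)
    if tag == "forall":
        return is_harrop(f[2])           # clause (b); f[1] is the vector of variables
    if tag == "imp":
        return is_harrop(f[2])           # clause (c); the hypothesis f[1] is unrestricted
    return False                         # "or" and "exists" in positive position


P = ("atom", "P")
Q = ("atom", "Q")

# (P or Q) implies P: the disjunction sits in a hypothesis, so this is Harrop.
harrop_example = ("imp", ("or", P, Q), P)

# forall x exists y P: the existential is in positive position, so this is not.
non_harrop_example = ("forall", ("x",), ("exists", "y", P))
```

Note that the test never inspects the hypothesis of an implication, which is exactly what allows ∨ and ∃ to occur there.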

(The notion of a Harrop formula was introduced by Harrop [1960]. Harrop showed that if
A and ∃xB(x) are closed formulas, and if A is Harrop, then A ⊃ ∃xB(x) is provable in
intuitionistic arithmetic iff ∃x(A ⊃ B(x)) is provable in intuitionistic arithmetic. This
generalized the following result of Kreisel [1958]: if A, B lack occurrences of the positive
connectives "∨" and "∃", and if A, ∃xB(x) are closed, then - again - A ⊃ ∃xB(x) is provable
in intuitionistic arithmetic iff ∃x(A ⊃ B(x)) is provable in the same system. As it happens, the
examples presented in chapter 4 effectively rely only on Kreisel's result and not on Harrop's
generalization, since all axioms used are intuitionistically equivalent to formulas in which
neither "∨" nor "∃" appear.)

The  following  properties  of  normal  proofs  make  it  possible  to  use  normalization  to  "run" 
proofs. 

(1)  Since  each  of  the  reduction  rules  preserves  the  end-formula  of  the  proof  to  which  it  is 
applied,  the  end-formula  of  the  normal  form  of  a  proof  will  always  be  the  same  as  the  end- 
formula  of  the  original  proof. 

(2) A normal, Harrop proof of an existential formula ∃xA(x) has the form:

         Π
       A(t)
    ∃I ------
       ∃xA(x)

Thus, a normal, Harrop proof of the existence of an object with a certain property
contains a proof that a particular object has that property, and a term denoting that object can
be easily extracted from the proof.

(3) A normal, Harrop proof of a formula of the form A ∨ B has one of the following forms:

          Π                 Π
          A                 B
    ∨I -------        ∨I -------
        A ∨ B             A ∨ B

Now, it is evident that normalization allows one to pass mechanically from a Harrop
proof of ∀x∃yA(x,y) and a term t_1, to a term t_2 together with a proof of A(t_1,t_2). To do this,
one simply applies the theorem ∀x∃yA(x,y) to the value t_1 (by use of the ∀-elimination
rule), and normalizes the resulting proof. By (2) above, the output value t_2 can be extracted
from the next to last step of the normal proof. Similarly, a closed Harrop proof of
∀x(A(x) ∨ B(x)) provides a uniform way of deciding which of A, B holds for any particular
value of x.


2.4  Proof  normalization


The reduction rules used in Prawitz's normalization procedure for natural deduction proofs
are given below. The rules may be applied at any position in a proof tree. That is to say, any
piece of the proof tree which matches the template given on the left hand side of a rule may
be replaced by the appropriate instantiation of the right hand side of the rule, and this
replacement constitutes an application of the rule. Notice that each rule removes a
configuration in which an introduction rule is followed immediately by an elimination rule.


The following notation is used: Π[x←t] denotes the result of replacing all free occurrences
of the variables x by the terms t in the proof Π. The figure

    Π
    A

denotes a proof Π which has A as its end-formula. The figure

    Π_2
    [A]
    Π_1

denotes the result of replacing each open occurrence of the assumption A in Π_1 by the proof Π_2
which has A as its end-formula. In both the substitution of terms for variables, and the
substitution of proofs for assumptions, it may be necessary to change the names of variables
bound by the ∀-introduction and ∃-elimination inferences; in this respect, substitution into
proofs is similar to substitution into formulas or into λ-expressions.


∧-reduction:

        Π_1   Π_2
         A     B
    ∧I ----------                Π_1
         A ∧ B          =>        A
    ∧E -------
          A

        Π_1   Π_2
         A     B
    ∧I ----------                Π_2
         A ∧ B          =>        B
    ∧E -------
          B

∨-reduction:

         Π_1
          A          [A]    [B]
    ∨I -------       Π_2    Π_3                 Π_1
        A ∨ B         C      C         =>       [A]
    ∨E --------------------------               Π_2
                 C                               C

         Π_1
          B          [A]    [B]
    ∨I -------       Π_2    Π_3                 Π_1
        A ∨ B         C      C         =>       [B]
    ∨E --------------------------               Π_3
                 C                               C

⊃-reduction:

                   [A]
                   Π_2
        Π_1         B                           Π_1
         A     ⊃I -------              =>       [A]
                   A ⊃ B                        Π_2
    ⊃E ------------------                        B
              B

∀-reduction:

           Π
           A
    ∀I -------
         ∀xA             =>       Π[x←t]
    ∀E --------                   A[x←t]
        A[x←t]

∃-reduction:

          Π_1
        A[x←t]       [A]
    ∃I ---------     Π_2                     Π_1
          ∃xA         C            =>     [A[x←t]]
    ∃E ------------------                 Π_2[x←t]
              C                               C

The reduction system given by the above reduction rules has the strong normalization
property (Prawitz [1965]). We have left out the permutation rules, because they are not
necessary for the execution of proofs.
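
As an illustration of how such a reduction acts on a proof tree, the following Python sketch implements the ∨-reduction at the root of a proof, together with the substitution of a proof for the open occurrences of an assumption. It is a propositional simplification and not the thesis's implementation: formulas are strings, proofs are nested tuples, the renaming of bound variables is ignored, and all names (substitute, or_reduce, the rule tags) are hypothetical.

```python
# A proof is one of:
#   ("assume", formula)
#   ("axiom", formula)
#   ("rule", rule_name, end_formula, [(subproof, discharged_formula_or_None), ...])


def substitute(host, a, replacement):
    """Replace each open occurrence of assumption `a` in `host` by `replacement`."""
    kind = host[0]
    if kind == "assume":
        return replacement if host[1] == a else host
    if kind == "axiom":
        return host
    _, name, formula, premises = host
    new_premises = [
        # occurrences of `a` below a premise that discharges `a` are not open
        (sub if discharged == a else substitute(sub, a, replacement), discharged)
        for sub, discharged in premises
    ]
    return ("rule", name, formula, new_premises)


def or_reduce(p):
    """Apply the or-reduction at the root of p if the template matches."""
    if p[0] == "rule" and p[1] == "or_elim":
        (major, _), (left, a), (right, b) = p[3]
        if major[0] == "rule" and major[1] == "or_intro_left":
            pi_1 = major[3][0][0]
            return substitute(left, a, pi_1)
        if major[0] == "rule" and major[1] == "or_intro_right":
            pi_1 = major[3][0][0]
            return substitute(right, b, pi_1)
    return p


# Example: the major premise A or B is obtained from a proof of A.
pi_1 = ("axiom", "A")
major = ("rule", "or_intro_left", "A or B", [(pi_1, None)])
left = ("rule", "imp_elim", "C",
        [(("assume", "A"), None), (("axiom", "A -> C"), None)])
right = ("rule", "imp_elim", "C",
         [(("assume", "B"), None), (("axiom", "B -> C"), None)])
proof = ("rule", "or_elim", "C", [(major, None), (left, "A"), (right, "B")])
reduced = or_reduce(proof)
```

In the example the introduction-elimination pair disappears: the reduced proof is the left minor premise with its assumption A replaced by the axiom leaf, and the right branch is discarded entirely.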


2.5  Proof  procedures


Let A = ∀xφ(x) be a closed non-Harrop formula. Suppose that one has a mechanical
procedure γ which, when given a vector of closed terms t, supplies a closed Harrop proof γ(t)
of the formula φ(t). Such a procedure will be called a "proof procedure for A". It turns out
that the availability of such a proof procedure makes it possible to execute proofs in which A
is stated as a lemma. That is to say, it is not necessary for the purposes of proof execution to
have a particular closed Harrop proof of a non-Harrop universal formula ∀xφ(x); it is
sufficient to have a method for generating closed Harrop proofs of each closed instance φ(t) of
∀xφ(x).

We require a proof procedure γ for ∀xφ(x) to supply a proof of φ(t) only under the
condition that t is closed. Nonetheless, it is convenient to allow a proof procedure to supply
(not necessarily closed) proofs of φ(t) for some vectors t of terms which are not closed,
depending on circumstances. Thus we formally define a proof procedure for ∀xφ(x) to be a
mechanical procedure γ with the following properties. (1) When γ is applied to any vector of
terms t, it returns either the atomic message "FAIL", or a Harrop proof of φ(t). (2) If t is
composed of closed terms, then γ(t) must be a closed proof, and not the message "FAIL".

The  use  of  proof  procedures  may  be  integrated  into  the  normalization  process  by  adding 
the  following  rule  to  the  class  of  reduction  rules  for  proofs  given  above. 

lemma-reduction:

       lemma: ∀xφ
    ∀E -----------          =>       γ(t)
          φ(t)

condition: γ is the proof procedure for ∀xφ,
and γ(t) ≠ FAIL


We  shall  henceforth  use  the  word  "lemma”  as  a  technical  term  which  denotes  a  universal 
formula  for  which  a  proof  procedure  has  been  supplied.  The  set  of  lemmas  together  with  their 
associated  proof  procedures  is  -  like  the  language  L  -  a  parameter  of  the  definition  of  the  class 
of  proofs,  and  of  the  class  of  normalization  reductions.  We  assume  that  the  proofs  generated 
by  proof  procedures  do  not  themselves  make  use  of  lemmas. 

The addition of lemma-reduction to the set of reduction rules does not interfere with the
strong normalization property. Also, the various properties of normal proofs which were given
in section 2.3 continue to hold if we add the restriction that the normal proofs in question be
closed. Since reductions on proofs pass from closed proofs to closed proofs (section 2.9), it
follows that closed proofs continue to have all of the computationally useful properties
remarked on in section 2.3.

In the example to be given in section 2.8, only one lemma is used, namely the lemma
∀x y (x≤y ∨ x>y) which states the decidability of numerical inequalities. The proof procedure for
this lemma simply provides the proof

             t_1≤t_2
    ∨I -------------------
        t_1≤t_2 ∨ t_1>t_2

if t_1 and t_2 are closed and the formula t_1≤t_2 is true, and the proof

             t_1>t_2
    ∨I -------------------
        t_1≤t_2 ∨ t_1>t_2

if t_1 and t_2 are closed and the formula t_1>t_2 is true; if t_1 or t_2 contains a free variable, then
"FAIL" is returned. Proof procedures for formulas of the form ∀x(R(x) ∨ ¬R(x)) with R
atomic play a role in normalization which corresponds to the role played by primitive
predicates in programming languages.
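
A proof procedure of this kind is easy to sketch. The Python below is a hypothetical illustration, not the procedure used in the thesis: it takes the lemma in the form ∀x y (x≤y ∨ x>y), assumes that closed terms have already been reduced to Python rational numbers, represents variables as strings, and encodes proofs as nested tuples. The names gamma, FAIL and is_closed are chosen for this example only.

```python
from fractions import Fraction
from numbers import Rational

FAIL = "FAIL"  # the atomic failure message


def is_closed(t):
    """Assumption: a closed term is one already reduced to a rational numeral."""
    return isinstance(t, Rational)


def gamma(t1, t2):
    """Proof procedure for the lemma: forall x y (x <= y or x > y)."""
    if not (is_closed(t1) and is_closed(t2)):
        return FAIL  # some argument contains a free variable
    disjunction = ("or", ("le", t1, t2), ("gt", t1, t2))
    if t1 <= t2:
        # a true closed atomic formula serves as an axiom leaf
        return ("or_intro_left", ("axiom", ("le", t1, t2)), disjunction)
    return ("or_intro_right", ("axiom", ("gt", t1, t2)), disjunction)


closed_case = gamma(Fraction(1, 2), 1)   # 1/2 <= 1 holds
open_case = gamma("x", 1)                # "x" stands for a free variable
```

The procedure satisfies both defining properties: on closed arguments it always returns a closed proof, and it returns the failure message only when some argument is not closed.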


2.6  Reductions  on  terms  of  L 

Suppose that one has a reduction system <T,R> for the terms of a first order language L.
Then the reductions R can be incorporated into proof normalization simply by allowing
them to be applied at will to terms which appear in the formulas of proofs. In such a hybrid
reduction system there is little interaction between the reductions on terms and the reductions
on proofs. If both the reduction system for terms and the reduction system for proofs have
the termination property, then so will the hybrid reduction system. This holds for the
uniqueness property as well, so long as the proof procedures for non-Harrop formulas
commute with term reductions.

As an example, consider a formulation of first order arithmetic in which terms are built up
from variables, decimal (or binary) notations for natural numbers, and function symbols for
successor, addition and multiplication. Consider also the reduction system consisting of the
single rule which replaces closed numerical terms by decimal notations for their values. In
computing numerical functions by means of proof normalization, the use of this term reduction
rule allows the addition and multiplication of numbers to be carried out by efficient
machinery external to the normalization procedure. In particular, the rule can be implemented
in such a way as to take advantage of the arithmetic hardware possessed by most computers.
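
The term reduction rule just described can be sketched in a few lines. In this hypothetical Python encoding (not taken from the thesis), a numeral is a Python int, a variable is a string, and a compound term is a tuple whose first component is one of the function symbols "succ", "+" or "*". The function reduce_term replaces every closed numerical subterm by the numeral for its value, delegating the arithmetic to the host machine.

```python
def reduce_term(t):
    """Replace closed numerical subterms of t by numerals for their values."""
    if isinstance(t, (int, str)):
        return t  # numerals and variables are already irreducible
    op, *args = t
    args = [reduce_term(a) for a in args]
    if all(isinstance(a, int) for a in args):
        # the subterm is closed: evaluate it with ordinary machine arithmetic
        if op == "succ":
            return args[0] + 1
        if op == "+":
            return args[0] + args[1]
        if op == "*":
            return args[0] * args[1]
    return (op, *args)  # a variable remains somewhere below: keep the structure


closed_value = reduce_term(("succ", ("*", 2, 3)))
partly_open = reduce_term(("*", ("+", 2, 1), "x"))
```

A term containing a variable is reduced only in its closed parts, so (2+1)·x becomes 3·x while the multiplication by x is left in place.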

Reductions on terms will receive little explicit attention in the rest of this thesis. However,
the presence of a well-behaved reduction system for terms will not affect any of the results
about proof normalization presented in this chapter or in chapter 3. By "well-behaved", we
mean (1) terminating, and (2) value-preserving with respect to the model (if any) currently
under consideration. Whenever reductions on terms are mentioned, the reader is to assume
that properties (1) and (2) hold.


2.7  Pruning 


The pruning operations are as follows.

        Π_1     Π_2    Π_3
       A ∨ B     C      C                Π_2
    ∨E ---------------------     =>       C       if A does not appear as an
                C                                 open assumption in Π_2.

        Π_1     Π_2    Π_3
       A ∨ B     C      C                Π_3
    ∨E ---------------------     =>       C       if B does not appear as an
                C                                 open assumption in Π_3.

There is also a pruning operation:

        Π_1     Π_2
        ∃xA      C                Π_2
    ∃E -------------      =>       C       if A does not appear as an
             C                             open assumption in Π_2

for the ∃-elimination inference, but it will play no role in the work described in this thesis.
Henceforth when we speak of a "pruning operation" we mean one of the two pruning
operations for ∨-elimination.


It should now be clear why the pruning operations are unlikely to be useful when applied
to proofs as originally given by people. The inferences removed by pruning are redundant,
and one does not expect to find them in proofs which have been constructed in a purposeful
way. However, the example given in the next section demonstrates that proofs which result
from simple automatic processes may contain such redundancies; in particular the process of
specializing a proof by normalization may introduce redundancies where none had at first
appeared.

The pruning operations may be adjoined to the set of reductions used in normalization.
The resulting reduction system retains the termination property, although the uniqueness
property is lost. This loss of uniqueness is an advantage and not a defect of pruning. Pruning
allows us to reduce proofs to a variety of equally satisfactory normal forms, some of which can
be arrived at more quickly than the normal form which results from normalization without
pruning. Thus, by dropping the uniqueness requirement, we gain efficiency.
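
The pruning operations for ∨-elimination depend only on the open assumptions of the two minor premises, so they can be sketched compactly. The Python below is an illustration under simplifying assumptions and is not the thesis's implementation: formulas are strings, proofs are nested tuples recording which formula each premise discharges, and the names open_formulas and prune are hypothetical. The example proof has the shape of a specialized case analysis in which the right-hand branch no longer uses its case assumption.

```python
# A proof is one of:
#   ("assume", formula)
#   ("axiom", formula)
#   ("rule", rule_name, end_formula, [(subproof, discharged_formula_or_None), ...])


def open_formulas(p):
    """Formulas attached to the open assumptions of the proof p."""
    if p[0] == "axiom":
        return frozenset()
    if p[0] == "assume":
        return frozenset({p[1]})
    result = frozenset()
    for sub, discharged in p[3]:
        result |= open_formulas(sub) - {discharged}
    return result


def prune(p):
    """Apply a pruning operation at the root of p if one is applicable."""
    if p[0] == "rule" and p[1] == "or_elim":
        (_major, _), (left, a), (right, b) = p[3]
        if a not in open_formulas(left):
            return left    # the first case never uses its assumption A
        if b not in open_formulas(right):
            return right   # the second case never uses its assumption B
    return p


goal = "exists z. Psi(x, .5, z)"
major = ("axiom", "x<=1 or x>1")  # stands in for the lemma instance
left = ("rule", "exists_intro", goal, [
    (("rule", "imp_elim", "Psi(x, .5, 1.5)", [
        (("assume", "x<=1"), None),
        (("axiom", "x<=1 -> Psi(x, .5, 1.5)"), None)]), None)])
right = ("rule", "exists_intro", goal, [
    (("rule", "imp_elim", "Psi(x, .5, x+1)", [
        (("axiom", ".5<=1"), None),
        (("axiom", ".5<=1 -> Psi(x, .5, x+1)"), None)]), None)])
proof = ("rule", "or_elim", goal, [(major, None), (left, "x<=1"), (right, "x>1")])
pruned = prune(proof)
```

Here the left branch genuinely depends on its case assumption, but the right branch does not, so the whole case analysis is replaced by the right branch and the test on x disappears from the computation.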


2.8  An  example 

The simplest algorithms to which the pruning operations are usefully applicable are pure
case analysis algorithms - algorithms which can be expressed by "plain" conditional
expressions. In what follows, we present a very small case analysis algorithm which is
nonetheless sufficient to illustrate the main points which we wish to make about pruning.
These points are: (1) pruning may be used to increase the efficiency of specializations of
algorithms, and (2) conventional descriptions of algorithms do not contain the data necessary
for the improvements in efficiency realized by pruning. Consider, then, the following
algorithm - given as a conditional expression - for computing an upper bound for both the
sum and the product of two positive rational numbers x and y:

u(x,y) = if x≤1 then y+1 else (if y≤1 then x+1 else 2xy)

We  will  use  the  bold  faced  letter  u  to  refer  both  to  the  algorithm,  considered  as  an 
abstract  method  which  can  be  formalized  in  various  ways,  and  to  the  above  concrete 
conditional  term. 

Now, suppose that the value .5 is given for y in advance, and that we wish to optimize u
given this additional information. The best we can do, if supplied only with the conditional
expression as a description of the algorithm, is to symbolically execute the expression on the
arguments x, .5. The result is:

u(x,.5) = if x≤1 then 1.5 else x+1

As will be seen below, the formalization of this upper bound algorithm as a proof allows
u(x,.5) to be automatically simplified, by use of normalization and pruning, to the expression
x + 1. The fact that x + 1 is an upper bound for both x + .5 and .5x does not depend on x
being less than or equal to one; this dependency information is contained in the proof, and
allows the automatic removal of the unnecessary case split according to the size of x. Note that
the pruning optimization has the unusual quality that it modifies the function computed by the
expression to which it is applied. However, pruning is guaranteed to preserve the validity of
an algorithm for the specification embodied in the end-formula of the proof describing the
algorithm. Also note that no transformation on conventional computational descriptions can
have the same effect as pruning. Conventional descriptions contain information only about
the function to be computed, and not about the purpose of the computation, and therefore
valid transformations on such descriptions must - unlike pruning - preserve extensional
meaning.
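
The point that pruning changes the function computed while preserving the specification can be checked numerically. The Python sketch below is illustrative only: it takes the tests in u to be x≤1 and y≤1 and the specification to be z ≥ x+y and z ≥ xy, and the names u, psi, u_half and u_half_pruned are hypothetical. Exact rational arithmetic is used so that boundary cases such as x = 1 are not disturbed by rounding.

```python
from fractions import Fraction


def u(x, y):
    """The upper bound algorithm as a conditional expression."""
    if x <= 1:
        return y + 1
    if y <= 1:
        return x + 1
    return 2 * x * y


def psi(x, y, z):
    """The specification: z bounds both the sum and the product of x and y."""
    return z >= x + y and z >= x * y


HALF = Fraction(1, 2)


def u_half(x):
    """u specialized to y = .5 by symbolic execution alone."""
    return Fraction(3, 2) if x <= 1 else x + 1


def u_half_pruned(x):
    """The result of normalization followed by pruning: no case split."""
    return x + 1


samples = [Fraction(1, 4), Fraction(1), Fraction(3)]
```

For x = 1/4 the two specializations return different values (3/2 and 5/4), so they are not extensionally equal, yet both values satisfy the specification for every sample.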

The following natural deduction proof formalizes the upper bound algorithm u. In the
proof and elsewhere Ψ(x,y,z) is used to abbreviate the formula (z ≥ x + y) ∧ (z ≥ xy).
Leaves of the proof tree which are not surrounded by brackets designate axioms or lemmas.
Three Harrop axioms ("x≤1 ⊃ Ψ(x,y,y+1)", "y≤1 ⊃ Ψ(x,y,x+1)", and
"(x>1)∧(y>1) ⊃ Ψ(x,y,2xy)") and one lemma, ∀x y(x≤y ∨ y<x), appear in the proof. We
assume that the proof procedure described in section 2.5 above has been provided for the
lemma. Also, reduction rules for numerical terms, which will, for example, reduce 2+1 to 3,
are assumed to be present. (The details of the notation used for rational numbers and of the
reductions which apply to numerical terms are unimportant for the purposes of the current
discussion.) We will use the capital letter U to designate the proof.


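(The tree is displayed in two parts: U' denotes the subproof for the case x>1, which is
given first.)

                     [y≤1]   y≤1 ⊃ Ψ(x,y,x+1)      [x>1]  [y>1]
                     ------------------------ ⊃E   ------------ ∧I
  ∀xy(x≤y ∨ y<x)           Ψ(x,y,x+1)                x>1∧y>1      (x>1)∧(y>1) ⊃ Ψ(x,y,2xy)
  -------------- ∀E        ---------- ∃I             ------------------------------------- ⊃E
     y≤1 ∨ y>1             ∃zΨ(x,y,z)                            Ψ(x,y,2xy)
                                                                 ---------- ∃I
                                                                 ∃zΨ(x,y,z)
  ---------------------------------------------------------------------------------- ∨E
                                     ∃zΨ(x,y,z)


                     [x≤1]   x≤1 ⊃ Ψ(x,y,y+1)
                     ------------------------ ⊃E
  ∀xy(x≤y ∨ y<x)           Ψ(x,y,y+1)                [x>1]
  -------------- ∀E        ---------- ∃I               U'
     x≤1 ∨ x>1             ∃zΨ(x,y,z)              ∃zΨ(x,y,z)
  ------------------------------------------------------------ ∨E
                         ∃zΨ(x,y,z)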


Note that we have neglected to universally quantify the variables x,y so as to arrive at a
proof in the standard ∀∃ form. In the current simple context it is more convenient for
purposes of exposition to leave the quantification implicit, and to specify that input values to
the proof viewed as an algorithm be substituted for the free variables. More precisely, in
order to compute an upper bound for the sum and product of two input values v1 and v2 by
means of normalization, v1 and v2 are first substituted for x,y throughout the proof U, and
then the proof is normalized.

Normalization of simple case analysis proofs such as U makes use only of the ∨-reduction
rules (section 2.2) and perhaps of proof procedures for lemmas. In this restricted case,
normalization of proofs corresponds closely to the execution of conditional terms by means of
repeated applications of the reduction rules:

C1:  (if TRUE then t1 else t2)  =>  t1

C2:  (if FALSE then t1 else t2)  =>  t2

A1:  R(t1,t2, . . . ,tn)  =>  TRUE
     if R is an atomic relation, t1,t2, . . . ,tn are closed ground terms,
     and R(t1,t2, . . . ,tn) holds

A2:  R(t1,t2, . . . ,tn)  =>  FALSE
     if R is an atomic relation, t1,t2, . . . ,tn are closed ground terms,
     and R(t1,t2, . . . ,tn) does not hold


The two ∨-reduction rules correspond in their effect to C1 and C2, while proof-procedures
for lemmas of the form ∀x1 x2 . . . xn (R(x1,x2, . . . ,xn) ∨ ¬R(x1,x2, . . . ,xn)) correspond to
the rules A1 and A2.

More specifically, the ∨-reduction rule takes an ∨-elimination inference

   Π1      Π2     Π3
  A∨B      C      C
  ------------------- ∨E
           C

in which the proof Π1 of the first premise indicates which one of A and B is true; depending
on whether it is A or B that holds, either the second "branch" Π2 or the third "branch" Π3 of
the inference is selected. This corresponds to making use of a binary decision between TRUE
and FALSE in a conditional expression to select a branch of the conditional.

As an example, the reader may wish to carry out the normalization of U when inputs 2
and .5 are substituted for x and y, respectively. The normalization of the proof will parallel
the normalization of the term

if 2≤1 then .5+1 else (if .5≤1 then 2+1 else 2(2)(.5))

with respect to the reduction rules C1, C2, A1, A2 given above. The final result of the
normalization will be:

  .5≤1    .5≤1 ⊃ Ψ(2, .5, 3)
  -------------------------- ⊃E
         Ψ(2, .5, 3)
  -------------------------- ∃I
        ∃zΨ(2, .5, z)

The value returned by this proof is "3".
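
The parallel between normalization and the execution of conditional terms can be made concrete with a small sketch. The encoding below (the classes Leq and If, and the functions step, normalize and u) is an illustrative assumption and not part of the formal development; numerical reductions such as 2+1 => 3 are simply delegated to Python arithmetic when the term is built.

```python
from dataclasses import dataclass
from fractions import Fraction
from typing import Union


@dataclass(frozen=True)
class Leq:
    """Atomic relation left ≤ right between closed numerical terms."""
    left: Fraction
    right: Fraction


@dataclass(frozen=True)
class If:
    """Conditional term; cond is an unevaluated Leq or already TRUE/FALSE."""
    cond: Union[Leq, bool]
    then: "Term"
    other: "Term"


Term = Union[If, Fraction]


def step(t):
    """Apply one of C1, C2, A1, A2 at the root of t; return None if none applies."""
    if isinstance(t, If):
        if isinstance(t.cond, Leq):      # A1 / A2: decide the closed atomic relation
            return If(t.cond.left <= t.cond.right, t.then, t.other)
        if t.cond is True:               # C1
            return t.then
        if t.cond is False:              # C2
            return t.other
    return None


def normalize(t):
    """Reduce t until no rule applies; also return the sequence of terms visited."""
    trace = [t]
    while (nxt := step(t)) is not None:
        t = nxt
        trace.append(t)
    return t, trace


def u(x, y):
    """The upper bound algorithm written as a conditional term."""
    one = Fraction(1)
    return If(Leq(x, one), y + 1, If(Leq(y, one), x + 1, 2 * x * y))


value, trace = normalize(u(Fraction(2), Fraction(1, 2)))
```

On the inputs 2 and .5 the rules fire in the order A2, C2, A1, C1, mirroring the two ∨-reductions and the two appeals to the proof procedure for the lemma.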

In order to specialize the algorithm expressed by U to the case where y is fixed at .5, .5 is
substituted for y throughout the proof, and the result is normalized. This process yields the
following "specialized" proof:

                     [x≤1]   x≤1 ⊃ Ψ(x, .5, 1.5)      .5≤1   .5≤1 ⊃ Ψ(x, .5, x+1)
                     --------------------------- ⊃E   --------------------------- ⊃E
  ∀xy(x≤y ∨ y<x)            Ψ(x, .5, 1.5)                    Ψ(x, .5, x+1)
  -------------- ∀E         ------------- ∃I                 ------------- ∃I
     x≤1 ∨ x>1              ∃zΨ(x, .5, z)                    ∃zΨ(x, .5, z)
  ----------------------------------------------------------------------------- ∨E
                                  ∃zΨ(x, .5, z)


This proof corresponds to the specialized conditional term, "if x≤1 then 1.5 else x + 1". A
further optimization is applicable to the specialized proof which is not applicable to the
conditional term, namely pruning. The second minor premise of the ∨-elimination inference
in the specialized proof above does not depend on the assumption x>1. It is this fact about the
dependency structure of the computation that the proof U, but not the conditional term u,
formalizes, and which allows pruning to take place. The result of applying pruning is:

  .5≤1    .5≤1 ⊃ Ψ(x, .5, x+1)
  ---------------------------- ⊃E
         Ψ(x, .5, x+1)
  ---------------------------- ∃I
         ∃zΨ(x, .5, z)

This  represents  the  same  algorithm  as  the  conditional  term  "x  +  1". 
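
The dependency check that licenses this step can be sketched as follows. The proof representation (Assume, Axiom, Infer, OrE) and the functions open_assumptions and prune are hypothetical names chosen for illustration, and the sketch covers only the simple situation discussed here: an ∨-elimination is replaced by one of its minor premises when that premise does not use the assumption it would discharge.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Assume:
    """An assumption leaf [A]."""
    formula: str


@dataclass(frozen=True)
class Axiom:
    """An axiom or lemma leaf."""
    formula: str


@dataclass(frozen=True)
class Infer:
    """Any inference that discharges nothing (⊃E, ∃I, ∀E, ...)."""
    rule: str
    premises: Tuple[object, ...]
    conclusion: str


@dataclass(frozen=True)
class OrE:
    """∨-elimination; each minor premise may discharge one assumption."""
    major: object
    left_assumption: str
    left: object
    right_assumption: str
    right: object


def open_assumptions(p):
    """The set of assumptions on which the end-formula of p depends."""
    if isinstance(p, Assume):
        return {p.formula}
    if isinstance(p, Axiom):
        return set()
    if isinstance(p, Infer):
        return set().union(*(open_assumptions(q) for q in p.premises))
    return (open_assumptions(p.major)
            | (open_assumptions(p.left) - {p.left_assumption})
            | (open_assumptions(p.right) - {p.right_assumption}))


def prune(p):
    """Drop every ∨E one of whose minor premises ignores its discharged assumption."""
    if isinstance(p, Infer):
        return Infer(p.rule, tuple(prune(q) for q in p.premises), p.conclusion)
    if isinstance(p, OrE):
        left, right = prune(p.left), prune(p.right)
        if p.left_assumption not in open_assumptions(left):
            return left
        if p.right_assumption not in open_assumptions(right):
            return right
        return OrE(prune(p.major), p.left_assumption, left, p.right_assumption, right)
    return p


goal = "∃zΨ(x,.5,z)"
lemma = Infer("∀E", (Axiom("∀xy(x≤y ∨ y<x)"),), "x≤1 ∨ x>1")
left = Infer("∃I", (Infer("⊃E", (Assume("x≤1"), Axiom("x≤1 ⊃ Ψ(x,.5,1.5)")),
                          "Ψ(x,.5,1.5)"),), goal)
right = Infer("∃I", (Infer("⊃E", (Axiom(".5≤1"), Axiom(".5≤1 ⊃ Ψ(x,.5,x+1)")),
                           "Ψ(x,.5,x+1)"),), goal)
specialized = OrE(lemma, "x≤1", left, "x>1", right)
pruned = prune(specialized)
```

Here the first minor premise uses its assumption x≤1 and is kept as a candidate branch, while the second never mentions x>1, so the whole case split collapses to the second branch. When both branches are independent of their assumptions the sketch arbitrarily prefers the first; that choice is an assumption of the sketch.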

Note that, if comparison is a very cheap operation, and adding is very expensive, then it
might happen that "x + 1" has an average case efficiency which is worse than "if x≤1 then 1.5
else x+1". This illustrates the general point that pruning is not guaranteed to increase
efficiency. However, pruning often improves the efficiency of an algorithm, and always
reduces its size. (Size reduction is an important effect of pruning in the experiments on
bin-packing; see chapter 4.)


2.9  Summary:  conditions  for  the  computational  usefulness  of  proofs 

In what follows, we collect together the various results and conditions which are relevant
to the usefulness of proofs for computation, and explicitly describe the relationships between
them. First of all, we have the results about the reduction rules involved in normalization:

(1a) Syntactic validity of the reduction rules for proofs (given in section 2.4) and of pruning:
each of these operations yields a well-formed proof when applied to a well-formed proof.

(1b) Preservation of the end-formula: the reduction rules do not modify the end-formula of a
proof.

(1c) Termination: every sequence of applications of reduction rules to a proof terminates.

(1d) Preservation of "closedness": a reduction rule yields a closed proof when applied to a
closed proof.

Second, there is the result concerning the normal form (sections 2.3, 2.5):

(2) A normal, closed, Harrop proof of ∃xA has the form,

      Π
     A(t)
   -------- ∃I
    ∃xA(x)

All of the above results are purely syntactic in nature. No mention is made of the
meaning of the formulas which appear in proofs. However, we have,

(3) The inference rules of natural deduction are sound with respect to the usual Tarskian
semantics.

The inference rules are also sound for the intuitionistic notion of validity. As a
consequence, each of the remarks made below will continue to hold if the words truth and
validity are taken to refer to the intuitionistic rather than the classical notions.

The final result which guarantees the possibility of executing proofs of ∀∃ formulas is:

(4) If Π is a proof of ∃xA(x) meeting certain conditions, then the normalization procedure
terminates when applied to Π, and results in a proof having the form,

      Π'
     A(t)
   -------- ∃I
    ∃xA(x)

where A(t) is true (in some intended model).

The conditions for the result (4) are: (a) the proof must be closed, (b) all axioms
appearing in the proof must be Harrop formulas, (c) all axioms appearing in the proof must
be true, and (d) the axioms which appear in proofs generated by proof procedures must be
true.

The proof of result (4) from the various results under (1), (2), (3) above is as follows: If Π is a
proof of ∃xA(x) meeting the conditions (a)-(d) of (4), then

°  normalization terminates on Π by (1c),
and yields a proof in the form,

      Π'
     A(t)
   -------- ∃I
    ∃xA(x)

by conditions (a),(b) and results (1a),(1b),(1d),(2);

°  finally A(t) is true by result (3) and conditions (c) and (d).

We wish to emphasize the degree to which the various results and conditions which come
into the proof of (4) are independent. In particular, none of the results under (1) and (2)
depends in any way on the truth of the axioms which appear in proofs. Thus syntactic and
semantic considerations do not interact and can be examined separately.

Chapter 3


Efficient  Implementation  of  Operations  on  Proofs 

The normalization and pruning operations described in the last chapter are quite
inefficient if implemented in a literal-minded way. The problem is not so much that the
asymptotic efficiency of an algorithm is degraded if it is formalized as a proof, but rather that
the elementary operations which are used in normalization are computationally expensive.
For example, the substitution of a proof for occurrences of an assumption is an expensive
operation, both in time and space.

However, as we will show in this chapter, normalization and pruning can be implemented
in an efficient manner if an appropriate data structure for proofs is used. Specifically, we
will represent natural deduction proofs by terms of an extended λ-calculus. The
normalization of such λ-calculus terms can be implemented efficiently by using environments
instead of literal substitutions, as is done in interpreters for λ-calculus based languages such as
LISP.
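
As a rough illustration of what "environments instead of literal substitutions" means, the following sketch evaluates closed λ-terms by recording variable bindings in a dictionary and pairing each abstraction with the environment in which it was met (a closure). The tuple encoding of terms and the function name evaluate are assumptions made for the example; the sketch performs call-by-value evaluation only, and is not the normalization procedure developed in this chapter.

```python
# Terms as nested tuples (an illustrative encoding):
#   ("var", name)   ("lam", name, body)   ("app", fn, arg)   ("const", value)

def evaluate(term, env):
    """Evaluate a closed term; env maps variable names to values."""
    tag = term[0]
    if tag == "const":
        return term[1]
    if tag == "var":
        return env[term[1]]                       # look the binding up
    if tag == "lam":
        return ("closure", term[1], term[2], env)  # keep the defining environment
    fn = evaluate(term[1], env)
    arg = evaluate(term[2], env)
    _, name, body, saved_env = fn
    # The body is never rewritten: the argument is bound to the variable
    # in an extended environment instead of being substituted into the body.
    return evaluate(body, {**saved_env, name: arg})


k = ("lam", "x", ("lam", "y", ("var", "x")))
result = evaluate(("app", ("app", k, ("const", 1)), ("const", 2)), {})
```

The body of an abstraction is traversed once per application but never copied, which is the saving over literal substitution that the text refers to.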

In section 3.1, we describe the connection between the natural deduction formalism and
the typed λ-calculus. Emphasis is placed on pure implicational logic, where the connection is
most direct. In sections 3.2 - 3.4 we present a λ-calculus based representation for natural
deduction proofs of full predicate logic. Sections 3.5 and 3.6 concern the manner in which
normalization and pruning operations apply to this representation. In section 3.7 we describe
an additional reduction rule used in the experiments of chapter 4 - namely, the permutation
rule for ∨-elimination. In section 3.8, schematic examples are presented which illustrate the
effect that pruning can have on the computational efficiency of proofs.

3.1  Natural deduction and the typed λ-calculus

The close structural correspondence between natural deduction proofs and terms of the
typed λ-calculus has been known for some time, and forms the basis for the calculi of
constructions developed by Scott[1970], Howard[1980], deBruijn[1970], Martin-Löf[1979], and
others. (The calculus which is closest to our own "p-calculus" [section 3.2] is
Martin-Löf's[1979] theory of types.) The central idea here is that the same elementary operations may
be used in (1) constructing and applying general methods of computation, and in (2)
establishing and applying general truths. As an example, consider (a) a term t(x) of the typed
λ-calculus in which (only) the variable x appears free, (b) a proof Π of a formula B in which
(only) the formula A appears as an open assumption. In both the cases (a) and (b), one has an
incompletely given construct; t does not denote any particular object, but will do so once a
concrete value has been supplied for x and substituted into t; similarly, Π does not establish
the truth of B, but will do so when any proof of A is given and substituted for occurrences of
the assumption. Thus, in both cases, the incomplete construct in question supplies a general
method for passing from a value (for the variable x or the assumption A) to a result (of
substitution). One may apply the operation of abstraction to the incomplete construct so as to
arrive at a term or proof which describes this general method. In case (a) the abstraction is
written "λx.t", while in case (b) the abstraction is the proof,

    [A]
     Π
     B
  ------- ⊃I
   A ⊃ B

One also has the converse operation at one's disposal, namely, application. If one has a
term t1, which describes a general method, and a term t2 of the appropriate type, then one
may form the term "t1(t2)", which denotes the result of applying the general method t1 to the
input t2. Similarly, if one has a proof Π1 of A ⊃ B - that is to say, a general method for
getting from proofs of A to proofs of B - and also a particular proof Π2 of A, then one may
form a proof which denotes the result of applying Π1 to Π2. That proof is:

   Π2      Π1
   A      A ⊃ B
  -------------- ⊃E
        B

Thus, the constructor "λ" which is used in building up λ-terms corresponds to the
inference rule ⊃I, while the constructor for application, t1(t2), corresponds to the inference
rule ⊃E.

In both the λ-calculus and the formalism of natural deduction proofs, normalization
involves applying general methods (as described by abstractions) to given inputs. Specifically,
the β-conversion rule for the λ-calculus reduces an application (λx.t1)(t2) of an abstraction
(λx.t1) to an input t2 to the term which results from substituting t2 for x in t1. The
corresponding reduction for proofs is just implication reduction:

           [A]
           Π1
   Π2       B
         ------- ⊃I               Π2
   A      A ⊃ B          =>       A
  --------------- ⊃E              Π1
         B                        B

For natural deduction proofs of pure implicational logic, the correspondence to the typed
λ-calculus is exact; any such proof may be rewritten as a λ-calculus term by (1) replacing
assumptions by variables, (2) replacing each ⊃-introduction inference by a λ-abstraction of
the variable corresponding to the assumption discharged by the inference, and (3) replacing
⊃-eliminations by applications. This change of notation from proof to λ-calculus language
results in no loss of information, and furthermore, the ⊃-reduction operation on proofs is
thereby mapped directly onto the β-conversion operation on λ-calculus terms. The particulars
of this "change of notation" are as follows.

First we present a formal definition of the typed λ-calculus. We start with a collection of
symbols τ1, . . . τn called the "base types". Complex types are built up from the base types
τ1, . . . τn by the binary constructor "→"; the inductive definition is: (1) each τi is a type; (2)
if τ, ρ are types then so is "τ → ρ". The base types are intended to denote sets of
"primitive" objects, while τ → ρ is intended to denote the set of mappings from objects of
type τ to objects of type ρ. Next, we assume that an infinite set V_τ of variables is given for
each type τ. The elements of V_τ are called "variables of type τ". V_τ and V_ρ are assumed to
be disjoint for distinct types τ and ρ. The following inductive clauses define the notion of a
term of type τ.

(1) Each variable v_τ of type τ is a term of type τ.

(2) If t is of type τ and x is a variable of type ρ then λx.t is a term of type ρ → τ.

(3) If t1 is of type τ → ρ and t2 is of type τ, then t1(t2) is a term of type ρ.
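
The three clauses translate directly into a procedure that computes the type of a term. The sketch below is one possible encoding, with class and function names (Base, Arrow, Var, Lam, App, type_of) chosen for the example; each variable carries its type, as in the definition above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Base:
    """A base type."""
    name: str


@dataclass(frozen=True)
class Arrow:
    """The complex type source → target."""
    source: object
    target: object


@dataclass(frozen=True)
class Var:
    """A variable together with its type."""
    name: str
    type: object


@dataclass(frozen=True)
class Lam:
    """The abstraction λvar.body."""
    var: Var
    body: object


@dataclass(frozen=True)
class App:
    """The application fn(arg)."""
    fn: object
    arg: object


def type_of(t):
    """Return the type of t, or raise TypeError if t is not a term of any type."""
    if isinstance(t, Var):                      # clause (1)
        return t.type
    if isinstance(t, Lam):                      # clause (2)
        return Arrow(t.var.type, type_of(t.body))
    if isinstance(t, App):                      # clause (3)
        fn_type, arg_type = type_of(t.fn), type_of(t.arg)
        if not (isinstance(fn_type, Arrow) and fn_type.source == arg_type):
            raise TypeError("ill-formed application")
        return fn_type.target
    raise TypeError("not a term")


tau, rho = Base("τ"), Base("ρ")
x = Var("x", tau)
f = Var("f", Arrow(tau, rho))
example_type = type_of(Lam(x, App(f, x)))
```

Here λx.f(x), with x of type τ and f of type τ → ρ, receives the type τ → ρ, while an application such as x(x) is rejected by clause (3).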

By "pure implicational logic" is meant the restricted natural deduction system in which
formulas are built up from propositional constants by use of implication alone, and in whose
proofs only the ⊃E and ⊃I inferences appear. The formulas which appear in proofs
correspond to the types of λ-terms; a propositional constant P corresponds to a base type τ_P,
while a formula A ⊃ B corresponds to a type τ_A → τ_B. More precisely, we assign to each
implicational formula A a type τ_A according to the following rules. (1) Each propositional
constant P is assigned a base type τ_P. (2) If the formula A has been assigned the type τ_A,
and the formula B has been assigned the type τ_B, then the formula A ⊃ B is assigned the
type τ_A → τ_B.

We now define the map Γ which rewrites proofs as λ-terms. It is assumed to start with
that variables of appropriate types have been selected for labeling formulas; we assume, that is
to say, that a unique variable v_A of type τ_A has been assigned to each formula A. Γ is
defined by induction on the structure of proofs. We use the notation Γ: Π => t to indicate
that the value of Γ applied to Π is t.

Γ: [A] => v_A

(That is, Γ when applied to a proof which consists simply of an assumption [A] yields the
variable v_A which labels the formula A.)

For example, the proof

  [A]    [A ⊃ (A ⊃ B)]
  -------------------- ⊃E
  [A]        A ⊃ B
  -------------------- ⊃E
           B
  -------------------- ⊃I
         A ⊃ B
  ------------------------- ⊃I
   (A ⊃ (A ⊃ B)) ⊃ (A ⊃ B)

when written in λ-calculus notation yields the term

λv_{A⊃(A⊃B)}. λv_A. (v_{A⊃(A⊃B)}(v_A))(v_A)
of type τ_{(A⊃(A⊃B))⊃(A⊃B)}.

Notice that, for any proof Π with end-formula A, the type of the λ-calculus notation Γ(Π)
for that proof is τ_A. Similarly, the types of the subterms of Γ(Π) correspond to the
end-formulas of the subproofs from which those subterms arise.
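
The rewriting of proofs as terms, and the agreement between end-formulas and types, can be checked mechanically on small examples. In the sketch below proofs, formulas and terms are nested tuples; this encoding, the identification of the type τ_A with the formula A itself, and the function names imp, end_formula, gamma and type_of are all assumptions made for illustration.

```python
def imp(a, b):
    """The formula a ⊃ b (also used as the type τ_a → τ_b)."""
    return ("⊃", a, b)

# Proofs of pure implicational logic:
#   ("assume", A)                        the assumption [A]
#   ("impI", A, proof_of_B)              ⊃I discharging [A]
#   ("impE", proof_of_A_imp_B, proof_of_A)


def end_formula(p):
    """The end-formula of the proof p."""
    if p[0] == "assume":
        return p[1]
    if p[0] == "impI":
        return imp(p[1], end_formula(p[2]))
    major, minor = end_formula(p[1]), end_formula(p[2])
    assert isinstance(major, tuple) and major[1] == minor, "ill-formed ⊃E"
    return major[2]


def gamma(p):
    """Rewrite a proof as a λ-term; the variable v_A is represented by ("var", A)."""
    if p[0] == "assume":
        return ("var", p[1])                      # assumptions become variables
    if p[0] == "impI":
        return ("lam", p[1], gamma(p[2]))         # ⊃I becomes λ-abstraction
    return ("app", gamma(p[1]), gamma(p[2]))      # ⊃E becomes application


def type_of(t):
    """The type of a term produced by gamma."""
    if t[0] == "var":
        return t[1]
    if t[0] == "lam":
        return imp(t[1], type_of(t[2]))
    fn_type, arg_type = type_of(t[1]), type_of(t[2])
    assert isinstance(fn_type, tuple) and fn_type[1] == arg_type
    return fn_type[2]


F = imp("A", imp("A", "B"))
proof = ("impI", F,
         ("impI", "A",
          ("impE", ("impE", ("assume", F), ("assume", "A")), ("assume", "A"))))
term = gamma(proof)
```

For this proof, term is the tuple form of λv_F.λv_A.(v_F(v_A))(v_A) with F = A ⊃ (A ⊃ B), and type_of(term) coincides with end_formula(proof).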


What we have done so far is to show that natural deduction proofs of a restricted system
can be represented as λ-calculus terms. It is possible to represent any natural deduction
proof in the same style, under the condition that appropriate additional constructors are
adjoined to the λ-calculus. But before going on to describe the λ-calculus formulation of full
predicate calculus, it is worthwhile looking more closely at the differences between the proof
notation and the λ-calculus notation for proofs of pure implicational logic.

Both proofs and λ-terms may be regarded as labeled trees: proof trees are labeled by
formulas and inference rule names, and "λ-trees" by variables (at leaves) and construction
rule names (at interior nodes). From this point of view the difference between proof notation
and λ-calculus notation lies in the choice of information which is explicitly stored on the tree.
In proofs, a formula is stored at every node. In a λ-term, the corresponding type information
is associated only with the variables which appear at the leaves of the tree, and must be
computed for other nodes. In proofs, the connection between inference rules and the sets of
assumptions which they discharge must be derived from "type information" (i.e. formulas on
the tree). In λ-terms, this information is represented more explicitly: the discharged
assumptions are labeled by a bound variable.

Suppose that all type information is dropped from a λ-term - that the typed variables are
replaced one for one by variables with which no type information is associated. Then the
resulting untyped λ-term represents the "logical structure" of a proof, in the following sense.
The underlying tree of the untyped term records a sequence of applications of inference rules
(in λ-calculus notation), and also describes the graph of connections between inference rules
and the assumptions (represented by variables) which they discharge. Thus if one were to
take a proof tree, and strip off the formulas which appear on the tree, while retaining a record
of the "logical structure" of the proof, then the result would contain the same information as
an untyped λ-expression. (The logical structure of proofs in the current sense is exactly the
structure preserved by isomorphisms between proofs in the sense of Statman[1974].)

Now, notice that the normalization reductions of the λ-calculus make no use of type
information; if one wishes to normalize a typed λ-calculus term, one is free to throw away the
types before doing the normalization, and the result will be no different. Correspondingly,
the sequence of steps taken in the normalization of a proof depends only on the "logical
structure" of the proof in the sense of the last paragraph. Two proof trees on which different
formulas appear will be subjected to the same sequence of reduction steps by normalization,
so long as the inference rules and the structure of discharges of assumptions on the two trees
are the same.
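
The claim that types may be thrown away before normalizing can be illustrated by a β-normalizer which carries type annotations along without ever inspecting them. The tuple encoding and the names free_vars, subst, normalize and erase below are assumptions of the sketch; bound variables are renamed by appending primes, a choice which is itself independent of the type annotations.

```python
# Terms: ("var", name, type)   ("lam", name, type, body)   ("app", fn, arg)
# An erased (untyped) term simply has None in every type position.

def free_vars(t):
    if t[0] == "var":
        return {t[1]}
    if t[0] == "lam":
        return free_vars(t[3]) - {t[1]}
    return free_vars(t[1]) | free_vars(t[2])


def subst(t, name, s):
    """Capture-avoiding substitution of s for the free occurrences of name in t."""
    if t[0] == "var":
        return s if t[1] == name else t
    if t[0] == "app":
        return ("app", subst(t[1], name, s), subst(t[2], name, s))
    _, bound, ty, body = t
    if bound == name:
        return t
    if bound in free_vars(s):                    # rename to avoid capture
        avoid = free_vars(s) | free_vars(body) | {name}
        new = bound
        while new in avoid:
            new += "'"
        body = subst(body, bound, ("var", new, ty))
        bound = new
    return ("lam", bound, ty, subst(body, name, s))


def normalize(t):
    """Full β-normalization; the type slots are copied but never examined."""
    if t[0] == "var":
        return t
    if t[0] == "lam":
        return ("lam", t[1], t[2], normalize(t[3]))
    fn = normalize(t[1])
    if fn[0] == "lam":
        return normalize(subst(fn[3], fn[1], t[2]))
    return ("app", fn, normalize(t[2]))


def erase(t):
    """Drop all type information from t."""
    if t[0] == "var":
        return ("var", t[1], None)
    if t[0] == "lam":
        return ("lam", t[1], None, erase(t[3]))
    return ("app", erase(t[1]), erase(t[2]))


A = "A"
twice = ("lam", "f", ("⊃", A, A),
         ("lam", "x", A, ("app", ("var", "f", ("⊃", A, A)),
                          ("app", ("var", "f", ("⊃", A, A)), ("var", "x", A)))))
typed = ("app", twice, ("lam", "y", A, ("var", "y", A)))
```

For the term typed above, erasing and then normalizing gives the same untyped term as normalizing and then erasing, namely λx.x.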

When we consider λ-calculus notation for arbitrary natural deduction proofs, it will be
seen that once again type information is not necessary for normalization. Furthermore,
untyped terms contain the desired output of computations, and can be subjected to pruning.
That is to say, the "logical structure" of a proof as expressed by an untyped λ-term is
sufficient not only to determine the form of the normalization sequence, but also to determine
the output value which is extracted from a normal proof, and to allow pruning to take place.
Thus, for practical purposes, it is always sufficient to deal with the untyped variants of proofs.

The following remarks summarize the interest of using a λ-calculus based notation for
proofs.

(1) The untyped variant of the λ-calculus notation for a proof contains exactly that
information which is relevant to the execution and pruning of the proof.

(2) An efficient technology exists for normalization of (proofs expressed as) λ-calculus
terms.


3.2  The p-calculus

In order to arrive at a notation of the kind discussed in the last section which is adequate
for arbitrary natural deduction proofs, new constructors for the inference rules other than
⊃-introduction and ⊃-elimination are added to the λ-calculus, namely: (1) pairing (for
∧-introduction), (2) unpairing (for ∧-elimination), (3) OI1 and OI2 (for ∨-introduction), (4) OE
(for ∨-elimination), (5) EI (for ∃-introduction), and (6) EE (for ∃-elimination).
∀-introduction and ∀-elimination are treated using the "old" constructors λ-abstraction and
application. The extended system just described will be referred to as the "p-calculus".

We will have occasion to deal with both a typed and an untyped variant of the p-calculus.
The relationship between proofs, typed terms, and untyped terms is the same for the
p-calculus as it is for the "plain" λ-calculus. Namely, a typed term of the p-calculus constitutes
a complete representation of a proof, while an untyped term serves to express only that
information in a proof which is needed for execution and pruning.

The "types" which will be assigned to terms of the typed p-calculus will not be types in
the ordinary sense; rather, they will be formulas of first order logic. The connection between
formulas and types given in the last section for implicational logic can be extended to the
p-calculus treatment of full first order logic; it is possible to assign types of the ordinary kind
(i.e. classes of functions) to arbitrary first order formulas, and to assign functions to p-calculus
terms, in such a way that the two assignments are consistent. Specifically, a term of "type" φ
will denote a function which actually belongs to the type assigned to φ. However, none of
the results which will concern us here depend on the details of such assignments, or indeed on
such assignments being possible at all. The reader will find further information on formulas
as types in Scott[1970] and Howard[1980].

We define the untyped variant of the p-calculus as follows. The starting point for the
definition is (1) an infinite set V of variables, (2) a first order language L, as described in
section 2.1, (3) the special symbol #, and (4) a set D of "defined symbols" with associated
arities. It is assumed that the variables V and the variables of L are distinct. The variables of
L are called "object variables", while the variables in V are called "proof variables". The
defined symbols D will be used as labels of proof procedures for lemmas, and in recursive
definitions (section 3.4) as well. The letters "α, β", "f, g, h" and "x, y, z" will be used to
designate proof variables, defined symbols, and object variables, respectively. The p-calculus
PL over L, then, is defined by the following inductive clauses. The phrase "p-term" is taken
to designate an element of PL.

(1) The terms and atomic formulas of L are p-terms (see section 2.1).

(2) The proof variables V are p-terms.

(3) The special constant # is a p-term.

(4) The defined symbols D are p-terms.

(4) If t1,t2 are p-terms, then so is <t1,t2> [pairing].

(5) If t is a p-term then so are π1(t), π2(t) [unpairing].

(6) If α is a variable, and t is a p-term, then λα.t is a p-term [proof-abstraction].

(7) If x1 x2 . . . xn are variables, and t is a p-term, then λx1 x2 . . . xn.t is a p-term
[object-abstraction].

(8) If t1,t2 are p-terms then so is t1(t2) [application].

(9) If α is a proof variable, and t1,t2,t3 are p-terms then so is OE(α,t1,t2,t3).

(10) If α is a proof variable, x an object variable, and t1,t2 are p-terms, then EE(x,α,t1,t2)
is a p-term.

Note that, in the case of object variables, we have chosen to introduce λ-abstraction of
arbitrary arity as a primitive constructor rather than using "Currying". This simplifies the
correspondence between λ-abstraction and ∀-introduction. Note that λx1 x2 . . . xn.t is not
an abbreviation for λx1.λx2. . . . λxn.t.

The above definition is given in terms of ordinary syntax suitable for written
presentation. However, in our discussions of formal operations on terms, we will treat
p-calculus terms as labeled trees, as was done in the discussion of the λ-calculus in the last
section.

There are several ways in which one can go about representing terms by labeled trees,
and the details of how this is done are not of any fundamental importance. However, in order
to avoid confusion later, it is worthwhile deciding here on a specific representation. That
representation is as follows. The relationship between a term and its immediate subterms is
coded directly in the structure of the tree - each node represents a term, and the sons of the
node represent the immediate subterms of that term. Leaf nodes are labeled by atomic
symbols - proof variables, #, and symbols of L. Each non-leaf node is labeled by the
constructor used for arriving at the current term from its immediate subterms, and by the
variables which are bound by that constructor. The constructor which appears at the root
node of any term is referred to as the "main constructor" of that term. The constructors are:
PAIR, APPLY, π1, π2, OI1, OI2, OE, λ, EI, EE. In the typed variant of the p-calculus,
nodes may be labeled by formulas as well. Note that the variables bound by a constructor -
for example the "x" in λx.t or the "α" and "x" in EE(x,α,t1,t2) - are not regarded as
subterms, but as a part of the information with which nodes of the tree are labeled.

In what follows, the notation "A(B)" for application is used in three different ways. (1)
When A and B are p-calculus terms, A(B) denotes the p-term whose main constructor is
APPLY and whose immediate subterms are A and B. (2) When A is a constructor (such as
π1) and B is a p-term, then A(B) designates the result of applying the constructor to the term
B; that is to say, A(B) designates the p-term whose main constructor is A and whose
immediate subterm is B. (3) If A is an operation on p-terms and B is a p-term, then A(B) will
denote the result of applying A to B. Thus the notation A(B) serves both as an external
syntax for a formal p-term whose main constructor is APPLY, and to denote the "actual"
application of an operation to an object. This is an ambiguity of the mention/use kind.
However, in each of the cases (1)-(3) context is sufficient to resolve the ambiguity.

In defining the typed variant of the p-calculus, it is most convenient to proceed by
assigning types (i.e. formulas) not to variables, but rather to the nodes of p-terms. In
particular, a typed p-term is a p-term some of whose nodes have been labeled by formulas
according to certain rules. The formula assigned to a given node represents the type of the
subterm rooted at that node (or, in proof language, the end-formula of the subproof rooted at
the node). We follow traditional terminology, and refer to a typed p-term as a construction.
The words "term" and "p-term" will be used to denote untyped p-terms.

Before describing the rules by which constructions are to be built up, we need to define
the notions of bound and free occurrences of variables, and of substitution, as they apply to
constructions. We use the phrase "labeled p-term" to refer to a p-term to whose nodes
formulas have been assigned in an arbitrary manner (in contrast to a typed p-term or
construction, whose labeling must follow certain rules).

Let t be a labeled p-term. An occurrence of a variable in t is an occurrence of the variable
either as a leaf of the p-term, or in one of the formulas assigned to the nodes of t. The
notion of a bound occurrence of a variable in a labeled p-term is defined below. The
definition follows standard lines, but includes new clauses for the constructors EE and OE.
(The new clauses express the fact that OE and EE, like λ, ∀ and ∃, have the effect of
binding variables.)

(1) Each occurrence of the variable α in t₂ or t₃ (but not in t₁) is a bound occurrence of α
in the terms (a) OE(α,t₁,t₂,t₃), (b) EE(x,α,t₁,t₂), (c) λα.t₂.

(2) Each occurrence of the variable x in t₂ is a bound occurrence of x in (a) EE(x,α,t₁,t₂),
and in (b) Λy.t₂ if x is among the variables y.

(3) Each occurrence of the variable x in the formula φ is a bound occurrence of x in (a)
∃xφ, and in (b) ∀y.φ if x is among the variables y.

(4) If t₁ is a subterm of t₂, each occurrence of a variable in t₁ which is bound in t₁ is also
bound in t₂.

Any variable occurrence which is not specified as bound by the above four rules is a free
occurrence of the variable.
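As an illustration only, the clauses for bound occurrences can be sketched in Python for the untyped part of a p-term. The tuple encoding and the tag names ("var", "lam", "Lam", "oe", "ee") are assumptions of this sketch and are not notation from the text; formulas attached to nodes are ignored, so clause (3) is not covered.

```python
def free_vars(t):
    """Return the set of variable names that occur free in the untyped p-term t.

    Assumed encoding: ("var", name), ("lam", a, body) for λa.body,
    ("Lam", x, body) for Λx.body, ("oe", a, t1, t2, t3), ("ee", x, a, t1, t2),
    and (tag, sub1, sub2, ...) for every constructor that binds nothing.
    """
    tag = t[0]
    if tag == "var":
        return {t[1]}
    if tag in ("lam", "Lam"):
        # The abstracted variable is bound in the body.
        return free_vars(t[2]) - {t[1]}
    if tag == "oe":
        # OE(a, t1, t2, t3): a is bound in t2 and t3, but not in t1.
        _, a, t1, t2, t3 = t
        return free_vars(t1) | ((free_vars(t2) | free_vars(t3)) - {a})
    if tag == "ee":
        # EE(x, a, t1, t2): x and a are bound in t2, but not in t1.
        _, x, a, t1, t2 = t
        return free_vars(t1) | (free_vars(t2) - {x, a})
    # Remaining constructors bind nothing: take the union over the subterms.
    result = set()
    for sub in t[1:]:
        if isinstance(sub, tuple):
            result |= free_vars(sub)
    return result
```

For instance, in OE(α, α, α, <α,β>) the first argument contributes a free occurrence of α, while the occurrences in the two branches are bound.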

The elementary operations on terms - notably the renaming of bound variables and
substitution - are defined in exactly the same way for the p-calculus (typed or untyped) as
they are for the plain λ-calculus. One only has to take the new variable-binding constructors
OE and EE into account in the obvious way.

For example, the definition of α-conversion (renaming of one bound variable) includes
the following clause for OE: Suppose that the terms t₂′,t₃′ result from the terms t₂,t₃ by the
replacement of all free occurrences of the variable α by the variable β. Suppose further that β
does not itself occur free in either t₂ or t₃. Then one may replace the term OE(α,t₁,t₂,t₃) by
the term OE(β,t₁,t₂′,t₃′). The other clauses for α-conversion are the standard clause for λ and
two clauses for EE: one for renaming the object variable, and one for renaming the proof
variable.

The operation of substitution may be defined as follows: in order to substitute the term t₂
for the variable x in t₁, first rename all bound variables which appear in t₁ in such a way that
no free variable of t₂ has a bound occurrence in t₁. (This can be done by a series of
α-conversions.) Then replace all free occurrences of x in t₁ by t₂. The result of this operation
will be denoted by "t₁[x ← t₂]". Evidently the above definition does not fully specify the
term which results from substitution because it leaves open the particular choice of variables
which are used in renaming. However, the result is uniquely defined modulo renaming of
bound variables. We shall henceforth regard as identical terms which differ only in the names
of their bound variables (i.e., terms which can be transformed into each other by means of
α-conversions).
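The definition just given can be sketched in Python as follows. This is a minimal sketch for untyped p-terms under an assumed tuple encoding (tags "var", "lam", "Lam", "oe", "ee"; every other constructor binds nothing); the helper names `fresh` and `under` are hypothetical. Bound variables are renamed only when they would capture a free variable of the substituted term, which is one admissible choice among the renamings that the definition leaves open.

```python
import itertools

def free_vars(t):
    """Variables occurring free in the untyped p-term t."""
    tag = t[0]
    if tag == "var":
        return {t[1]}
    if tag in ("lam", "Lam"):
        return free_vars(t[2]) - {t[1]}
    if tag == "oe":      # OE(a, t1, t2, t3): a is bound in t2 and t3 only
        return free_vars(t[2]) | ((free_vars(t[3]) | free_vars(t[4])) - {t[1]})
    if tag == "ee":      # EE(x, a, t1, t2): x and a are bound in t2 only
        return free_vars(t[3]) | (free_vars(t[4]) - {t[1], t[2]})
    return set().union(*[free_vars(u) for u in t[1:] if isinstance(u, tuple)])

_ids = itertools.count()

def fresh(avoid):
    """Return a variable name that is not in the set `avoid`."""
    while True:
        name = f"v{next(_ids)}"
        if name not in avoid:
            return name

def under(bound, body, x, s):
    """Compute body[x <- s], where body lies in the scope of the variables `bound`.

    Bound variables that occur free in s are first renamed (alpha-conversion).
    Returns the (possibly renamed) bound variables and the new body.
    """
    bound = list(bound)
    if x in bound:       # x is rebound here, so it has no free occurrence below
        return bound, body
    for i, v in enumerate(bound):
        if v in free_vars(s):
            w = fresh(free_vars(s) | free_vars(body) | set(bound) | {x})
            body = subst(body, v, ("var", w))
            bound[i] = w
    return bound, subst(body, x, s)

def subst(t, x, s):
    """Return t[x <- s], renaming bound variables of t where necessary."""
    tag = t[0]
    if tag == "var":
        return s if t[1] == x else t
    if tag in ("lam", "Lam"):
        (v,), body = under([t[1]], t[2], x, s)
        return (tag, v, body)
    if tag == "oe":
        _, a, t1, t2, t3 = t
        # Both branches share the binder a, so they are renamed together.
        (a,), both = under([a], ("pair", t2, t3), x, s)
        return ("oe", a, subst(t1, x, s), both[1], both[2])
    if tag == "ee":
        _, y, a, t1, t2 = t
        (y, a), body = under([y, a], t2, x, s)
        return ("ee", y, a, subst(t1, x, s), body)
    return (tag,) + tuple(subst(u, x, s) if isinstance(u, tuple) else u for u in t[1:])
```

Substituting β for α in λβ.α, for example, renames the bound β before the replacement, so that the result is λv.β for some new variable v rather than the incorrect λβ.β.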

The notation t₁[t₃ ← t₂] designates the result of substituting t₂ for some occurrences of t₃ in
t₁. Whenever this notation is used, it is assumed that no bound variable of t₁ appears free in
t₃. (Thus, no free variable of an occurrence of t₃ within t₁ is bound by a constructor of t₁.)
As in the case of substitution of terms for variables, the substitution of terms for terms
involves changing of bound variable names in t₁ so as to avoid conflicts with the variables
which appear free in t₂. Finally, t₁[x ← t₂] denotes the result of substituting the terms t₂ for
the variables x in parallel.

We are now in a position to define the notion of a typed p-term, or construction. A
construction is a labeled p-term which is built up according to the rules given below and
which in addition satisfies the following general restrictions: (1) Every occurrence of a proof
variable in a construction t must be labeled by a formula. (2) Suppose that t′ is any subterm
of the construction t, and that α is a proof variable which occurs free in t′. Then every free
occurrence of α in t′ must be labeled by the same formula.

The rules for building up constructions given below correspond exactly to the inference
rules of natural deduction. The name of the inference rule corresponding to each rule is
given in brackets next to the rule. We make use of the notation t:F to indicate a construction
whose root is labeled by the formula F. (Other nodes of t:F than the root may be labeled by
formulas as well.) Most of the rules are given in the notation "t₁:F₁, t₂:F₂ ... tₙ:Fₙ ⇒ t:F",
meaning that if t₁:F₁, t₂:F₂ ... tₙ:Fₙ are constructions then so is t:F.

As a parameter of the definition given below, we assume that a collection of proof
procedures γ₁,γ₂ ... γₙ has been given for lemmas F₁,F₂ ... Fₙ. In the current context -
that is to say, in the context of a discussion of constructions - a proof procedure γ for a
formula ∀x₁,x₂ ... xₖF(x₁,x₂ ... xₖ) is a procedure which, when given terms t₁,t₂ ... tₖ of L,
either returns "FAIL", or else supplies a construction t:F(t₁,t₂ ... tₖ), where the construction t
does not itself make use of any lemmas. Further, we require that γ(t₁,t₂ ... tₖ) be a closed
construction whenever t₁,t₂ ... tₖ are closed. Thus a proof procedure in the context of
constructions plays the same role as the proof procedures for natural deduction proofs
discussed in section 2.5. We assume also that names f₁,f₂, ... fₙ of appropriate arities from
the set D of defined symbols (see page 34) have been assigned as labels of the proof
procedures γ₁,γ₂ ... γₙ.
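To make the notion concrete, here is a hedged Python sketch of one possible proof procedure, for the formula ∀xy(x=y ∨ x≠y) restricted to numerals. The name `eq_dec`, the tuple encoding of terms and formulas, and the convention that the last component of a node is its attached formula are all assumptions of the sketch, not part of the text.

```python
FAIL = "FAIL"

def is_closed_numeral(t):
    """True if t is a closed term of the assumed form ("num", n)."""
    return isinstance(t, tuple) and len(t) == 2 and t[0] == "num" and isinstance(t[1], int)

def eq_dec(t1, t2):
    """Hypothetical proof procedure for ∀xy(x=y ∨ x≠y).

    Given closed numerals it supplies a closed construction that uses no lemmas,
    only a Harrop axiom (an equation or its negation); otherwise it returns FAIL.
    """
    if not (is_closed_numeral(t1) and is_closed_numeral(t2)):
        return FAIL
    eq = ("eq", t1, t2)
    neq = ("neq", t1, t2)
    goal = ("or", eq, neq)
    if t1[1] == t2[1]:
        return ("oi1", ("axiom", eq), goal)    # OI₁(#:t1=t2) : t1=t2 ∨ t1≠t2
    return ("oi2", ("axiom", neq), goal)       # OI₂(#:t1≠t2) : t1=t2 ∨ t1≠t2
```

The procedure fails on open terms such as a variable, which is permitted: a proof procedure is only required to return either FAIL or a construction of the appropriate instance.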


The clauses of the inductive definition of the notion of a construction are as follows.
Note that we have required that the axioms which appear in constructions be Harrop formulas
(clause 2 below). Henceforward, we will also assume that the proofs which we consider
contain only Harrop axioms, since proofs which do not satisfy this requirement are not in any
case of much computational interest.

(1) α:A is a construction for any proof variable α and any formula A. [assumption]

(2) If F is any Harrop formula, then #:F is a construction. [axiom]

(3) If f labels a proof procedure for the lemma A = ∀x₁,x₂ ... xₖφ, then f:A is a
construction. [lemma]

(4) t₁:A, t₂:B ⇒ <t₁,t₂>:A∧B [∧-introduction]

(5) (a) t:A∧B ⇒ π₁(t):A (b) t:A∧B ⇒ π₂(t):B [∧-elimination]

(6) (a) t:A ⇒ OI₁(t):A∨B (b) t:B ⇒ OI₂(t):A∨B [∨-introduction]

(7) Let t₁:A∨B, t₂:C, t₃:C be constructions, and let α be a proof variable. Suppose that
free occurrences of α in t₂ are assigned the formula A, and that free occurrences of α in
t₃ are assigned the formula B. Then OE(α,t₁,t₂,t₃):C is a construction. [∨-elimination]

(8) Let t:B be a construction in which free occurrences of the proof variable α are
assigned the formula A. Then (λα.t):A⊃B is a construction. [⊃-introduction]

(9) t₁:A⊃B, t₂:A ⇒ t₁(t₂):B [⊃-elimination]

(10) Let t:A be a construction with the property that no variable of the vector of
variables x appears free in any of the formulas assigned to the free proof variables of t.
Then (Λx.t):∀xA is a construction. [∀-introduction]

(11) t₁:∀xA(x) ⇒ t₁(t₂):A[x←t₂] where t₂ is any vector of terms of L [∀-elimination]

(12) t₂:A[x←t₁] ⇒ EI(t₁,t₂):∃xA [∃-introduction]

(13) Let t₁:∃xA, and t₂:C be constructions satisfying the following restrictions. (a) Free
occurrences of the proof variable α in t₂ are assigned the formula A. (b) Let F be any
formula which is assigned to a free proof variable of t₂ other than α. Then x may not
appear free in F. (c) The variable x may not appear free in C. Then EE(x,α,t₁,t₂):C is a
construction. [∃-elimination]

Note that, since we do not distinguish between formulas which differ only in the names
of their bound variables, the identity of the variables bound by Λ and the variables bound by
∀ in "(Λx.t):∀xA" of rule (10) is a matter of notational convenience and not a requirement.
That is to say, for any new tuple of variables y, "Λy.(t[x←y]):∀xA" and "(Λx.t):∀xA" are
equivalent labeled p-terms and have equal standing as well formed constructions. A similar
remark applies to the construction of rule (13).
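The way in which the rules constrain the labeling can be illustrated by a small checker. The following Python sketch covers only rules (1), (4), (5), (8) and (9), that is, the fragment with ∧ and ⊃; the encoding (each node is a tuple whose last component is its attached formula, atomic formulas are strings) and the function names are assumptions made for the illustration.

```python
def require(cond, msg):
    """Raise ValueError with the given message unless cond holds."""
    if not cond:
        raise ValueError(msg)

def label(t):
    """The formula attached to the root of the labeled term t."""
    return t[-1]

def merge(*contexts):
    """Combine assignments of formulas to free proof variables, enforcing
    general restriction (2): one formula per free proof variable."""
    out = {}
    for ctx in contexts:
        for v, f in ctx.items():
            require(out.setdefault(v, f) == f, f"proof variable {v} carries two formulas")
    return out

def check(t):
    """Return {free proof variable: formula} if t is a construction of the
    fragment; raise ValueError otherwise."""
    tag, F = t[0], t[-1]
    if tag == "var":                                    # rule (1): assumption
        return {t[1]: F}
    if tag == "pair":                                   # rule (4): ∧-introduction
        require(F == ("and", label(t[1]), label(t[2])), "bad label on pair")
        return merge(check(t[1]), check(t[2]))
    if tag in ("pi1", "pi2"):                           # rule (5): ∧-elimination
        L = label(t[1])
        require(L[0] == "and" and F == L[1 if tag == "pi1" else 2], "bad label on projection")
        return check(t[1])
    if tag == "lam":                                    # rule (8): ⊃-introduction
        _, a, body, _ = t
        ctx = check(body)
        require(F[0] == "imp" and F[2] == label(body), "bad label on abstraction")
        require(ctx.get(a, F[1]) == F[1], "discharged assumption has the wrong formula")
        ctx.pop(a, None)
        return ctx
    if tag == "app":                                    # rule (9): ⊃-elimination
        require(label(t[1]) == ("imp", label(t[2]), F), "bad label on application")
        return merge(check(t[1]), check(t[2]))
    raise ValueError(f"constructor {tag} is outside the fragment handled here")
```

A closed construction is one for which `check` returns the empty assignment; for example, the labeled term for λα.<π₂(α),π₁(α)> with α:A∧B checks against A∧B ⊃ B∧A.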

Arbitrary natural deduction proofs can be rewritten as constructions by a straightforward
extension of the methods which apply to proofs of pure implicational logic. Specifically, one
starts out with an assignment of proof variables α_A to formulas A. Then the map Γ from
proofs to constructions is defined by induction on the structure of proofs just as it was in
section 3.1. What Γ does is (1) replace each assumption [A] by the variable α_A assigned to A,
(2) replace axioms by the special constant #, (3) replace lemmas by the defined symbols
which label their proof procedures, (4) replace each inference rule by the corresponding
constructor, and finally (5) label each node of the p-term by the formula which occurs at the
corresponding node of the proof tree. The passage in the other direction is even more
straightforward: to go from a construction to a natural deduction proof one keeps the
formulas and constructors which label the tree, but the proof variables are thrown away. The
clauses of the inductive definition of Γ are given below, using the notation Γ: Π ⇒ t:F to
indicate that the value of Γ applied to Π is the construction t:F.


(1) Base case: Γ: [A] ⇒ α_A:A

(That is, Γ when applied to a proof which consists simply of an assumption [A] yields the
construction α_A:A.)

(2) Base case: Γ: A ⇒ #:A, where A is an axiom.

(Γ when applied to a proof which consists of an axiom A yields the construction #:A.)

                  Π
                  B
(8)    Γ:  ⊃I ---------     ⇒    λα_A.Γ(Π):A⊃B
                A ⊃ B

               Π₁     Π₂
               A     A ⊃ B
(9)    Γ:  ⊃E -------------     ⇒    (Γ(Π₂))(Γ(Π₁)):B
                    B

                  Π
                  A
(10)   Γ:  ∀I ---------     ⇒    Λx.Γ(Π):∀xA
                 ∀xA

                  Π
                 ∀xA
(11)   Γ:  ∀E ---------     ⇒    (Γ(Π))(t):A[x←t]
                A[x←t]

                  Π
                A[x←t]
(12)   Γ:  ∃I ---------     ⇒    EI(t,Γ(Π)):∃xA
                 ∃xA

               Π₁     Π₂
               ∃xA     C
(13)   Γ:  ∃E -------------     ⇒    EE(x,α_A,Γ(Π₁),Γ(Π₂)):C
                    C

The construction notation U_c for the upper bound proof U of section 2.8 is given below
as an example.

OE(α,LESSD(x,1),
   EI(y+1,#(α)),
   OE(β,LESSD(y,1),
      EI(x+1,#(β)),
      EI(2xy,#(<α,β>)))):∃zΨ(x,y,z)

In the above presentation of U_c, only the root node is explicitly labeled by a formula; we
have neglected to specify the formulas which are attached to the various subterms of the
construction. A complete description of the construction is as follows, where the formula F
which labels each subterm t is specified using the notation "t:F".

OE(α,{LESSD:∀xy(x<y∨y<x)}(x,1):x<1∨x>1,
   EI(y+1,{#:x<1⊃Ψ(x,y,y+1)}(α:x<1):Ψ(x,y,y+1)):∃zΨ(x,y,z),
   OE(β,{LESSD:∀xy(x<y∨y<x)}(y,1):y<1∨y>1,
      EI(x+1,{#:y<1⊃Ψ(x,y,x+1)}(β:y<1):Ψ(x,y,x+1)):∃zΨ(x,y,z),
      EI(2xy,(#:(x>1)∧(y>1)⊃Ψ(x,y,2xy))
             (<α:{x>1},β:{y>1}>:{x>1∧y>1}):Ψ(x,y,2xy)))):∃zΨ(x,y,z)

3.3  Substitution 

The effect of the principle of "substitution of equals for equals" can be obtained by the
use of a scheme of Harrop axioms (as was done for FALSE-elimination; see section 2.1).
However, it is more convenient for our purposes to include the following inference rule which
expresses this principle directly.

Substitution:

         t₁ = t₂     A                    t₁ = t₂     A
         --------------        and        --------------
           A[t₁ ← t₂]                       A[t₂ ← t₁]

On the p-calculus side, a new constructor SB(t₁,t₂) is added, and the clauses

               Π₁       Π₂
             t₁ = t₂     A
    Γ:  SB ------------------     ⇒    SB(Γ(Π₁),Γ(Π₂)):A[t₁ ← t₂]
               A[t₁ ← t₂]

               Π₁       Π₂
             t₁ = t₂     A
    Γ:  SB ------------------     ⇒    SB(Γ(Π₁),Γ(Π₂)):A[t₂ ← t₁]
               A[t₂ ← t₁]

are added to the definition of the map Γ from natural deduction proofs to constructions.


3.4  Recursive  constructions 

Recursive definitions of functions are commonly used for describing computational
methods, both in mathematics and in automatic computation. Most programming languages
allow definition by recursion, and in purely applicative languages, such as pure LISP
[McCarthy et al, 1962], the principal constructors used in building up programs are just
function application and recursive definition.

We too will make use of definition by recursion. Specifically, we will allow (mutually)
recursive definitions of the form:

f₁ ← t₁:A₁

f₂ ← t₂:A₂


where the {fᵢ} are defined symbols, and the {tᵢ} are constructions in which f₁ ... fₙ may
appear, and the {Aᵢ} are universal formulas. The following restrictions apply: (1) each
construction tᵢ:Aᵢ must be closed, and (2) each occurrence of a defined name fⱼ in any tᵢ must
have Aⱼ as its attached formula.

Putting the matter more formally, we implement definitions by recursion in the following
way. A parameter of the definition of the class of constructions is the set of assignments
made to defined symbols. Until now, those assignments have been proof procedures with
appropriate characteristics. Henceforth, we will allow constructions as well as proof
procedures to be assigned as values of defined symbols, subject to the restrictions described in
the last paragraph. Of course, each defined symbol may be assigned only one value, whether
a proof procedure or a construction. We will refer to a set of assignments of constructions
and proof procedures to defined symbols as a "system of definitions" or a "system of
lemmas". The system of definitions which is in effect for the purposes of any particular
discussion will be referred to as the "current system of definitions".

If one switches back from the terminology of constructions to that of natural deduction
proofs, then the "recursive proofs" which correspond to recursive constructions are proofs
which use their own end-formulas as lemmas. An example of a computationally useful
recursive proof is as follows.

Let pred denote the predecessor function on natural numbers (it does not matter what
value is chosen for pred(0)). Then one formulation of the induction principle for the formula
φ(x) is as follows:

IND_φ: ∀x( {φ(0) ∧ ∀y((y≠0 ∧ φ(pred y)) ⊃ φ(y))} ⊃ φ(x))

The following is a recursive proof of IND - a proof in which IND itself is used as a
lemma. We will need an abbreviation. Let H be the formula:

∀y((y≠0 ∧ φ(pred y)) ⊃ φ(y))

Then IND_φ is just ∀x(φ(0)∧H ⊃ φ(x)). The proof, then, is as follows.

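                                                       IND_φ:∀x(φ(0)∧H ⊃ φ(x))
                                                    ∀E -----------------------
                                         [φ(0)∧H]      φ(0)∧H ⊃ φ(pred x)                  [φ(0)∧H]
                                      ⊃E ----------------------------------             ∧E --------
                     [φ(0)∧H]     [x≠0]             φ(pred x)                               H
                  ∧E --------  ∧I -----------------------------          ∀E ---------------------
   ∀xy(x=y∨x≠y)  [x=0]  φ(0)            x≠0∧φ(pred x)                      x≠0∧φ(pred x) ⊃ φ(x)
∀E ------------  SB -----------  ⊃E -----------------------------------------------------------
    x=0 ∨ x≠0        φ(x)                                  φ(x)
∨E ---------------------------------------------------------------
                             φ(x)
                    ⊃I -----------------
                       φ(0)∧H ⊃ φ(x)
                    ∀I -----------------
                     ∀x(φ(0)∧H ⊃ φ(x))
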


In the notation of constructions, the above proof of IND_φ looks like this:

IND_φ = Λx.λα.OE(β,EQD(x,0),
                 SB(β,π₁(α)),
                 {(π₂(α))(x)}(<β,(IND_φ(pred x))(α)>))

where EQD is a proof procedure for the formula ∀xy(x=y ∨ x≠y); EQD returns the
construction "OI₁(#:t₁=t₂):t₁=t₂∨t₁≠t₂" if t₁ and t₂ are closed terms with t₁=t₂, and
"OI₂(#:t₁≠t₂):t₁=t₂∨t₁≠t₂" if t₁ and t₂ are closed terms with t₁≠t₂.

The usefulness of this construction derives from the fact that the value to which the
lemma IND is applied is the predecessor of the value to which the theorem IND is applied.
The construction may be executed when applied to a particular numeral in the same way that
a recursively defined function is run: by repeated replacements of the defined name IND by
its definition. As a consequence of the fact that the value passed to successive recursive calls
to IND is constantly decreasing, this mode of execution will terminate (under the right
reduction order), yielding a construction in which no reference to IND any longer appears.
The details of this process will be discussed later (section 3.5).
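The analogy with a recursively defined function can be made concrete by a small Python sketch. It keeps only the numerical content of the recursive construction for IND and discards all formulas and equality proofs; the names `ind`, `pred`, `alpha` and `trace` are assumptions of the sketch, and the pair `alpha` stands for the two components π₁(α) and π₂(α) of the hypothesis.

```python
def pred(n):
    """Predecessor on natural numbers; the value at 0 is arbitrary (0 here)."""
    return n - 1 if n > 0 else 0

def ind(x, alpha, trace=None):
    """Computational reading of the recursive construction for IND.

    alpha = (base, step): base plays the role of π₁(α), a value for φ(0);
    step plays the role of π₂(α), taking y and a value for φ(pred y) to a value for φ(y).
    If a list is passed as `trace`, the argument of each call is recorded in it.
    """
    base, step = alpha
    if trace is not None:
        trace.append(x)
    if x == 0:                       # case x = 0 of the analysis on EQD(x, 0)
        return base
    # case x ≠ 0: one replacement of the defined name by its definition,
    # with the recursive call applied to the predecessor of x
    return step(x, ind(pred(x), alpha, trace))
```

With `alpha = (1, lambda y, r: y * r)` the call `ind(3, alpha)` unfolds through the arguments 3, 2, 1, 0 and then stops, which is the decreasing sequence of values referred to above.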

A somewhat simpler way of achieving the effect of induction by the use of recursive
proofs is as follows. Suppose that one has a proof Π₁ of φ(0), and a proof Π₂ of ∀y((y≠0 ∧
φ(pred y)) ⊃ φ(y)). Then the following recursive proof P_φ of ∀xφ(x) is adequate to the
same computational purposes as is the above proof of IND_φ.

                                        P_φ:∀xφ(x)
                                     ∀E ----------                            Π₂
                                [x≠0]    φ(pred x)               ∀y(y≠0 ∧ φ(pred y) ⊃ φ(y))
                        Π₁   ∧I --------------------          ∀E --------------------------
   ∀xy(x=y∨x≠y)  [x=0]  φ(0)     x≠0 ∧ φ(pred x)                  x≠0 ∧ φ(pred x) ⊃ φ(x)
∀E ------------  SB ----------  ⊃E ------------------------------------------------------
    x=0 ∨ x≠0        φ(x)                              φ(x)
∨E -----------------------------------------------------------
                           φ(x)
                     ∀I ----------
                         ∀xφ(x)

The construction notation for P_φ is as follows, where t₁ is the construction notation for
Π₁, and t₂ the construction notation for Π₂.

P_φ = Λx.OE(α,EQD(x,0),
            SB(α,t₁),
            (t₂(x))(<α,P_φ(pred x)>))

Suppose that a system S of lemmas has the property that every axiom in sight is true in a
particular model M. That is to say, we suppose that all the axioms which appear in
constructions of S, and all axioms which appear in the constructions generated by proof
procedures of S, are true in M. Note that these conditions are still not sufficient to guarantee
that constructions built up from lemmas of S will have end-formulas which are true in M.
The reason for this is that constructions can take the form of circular arguments. Consider,
for example, the recursively defined construction "f:∀xA ← f:∀xA", which of course provides
no evidence at all for the truth of ∀xA. In order to verify the truth of the end-formula of a
construction, or the correctness of the computation described by the construction, it is not
sufficient to verify the axioms which are used (directly or indirectly) by the construction. It is
also necessary to verify the truth of the lemmas which are used, even though (recursive)
constructions for those lemmas have been supplied.


3.5  Operations  on  constructions 

In this section, the various elementary operations which are involved in the computational
use of constructions are described. These operations are: (a) the normalization reductions, and
(b) the pruning operations. These operations are arrived at by direct translation into
construction notation of the operations on natural deduction proofs given in sections 2.4 and 2.7.

∧-reduction:

    π₁(<t₁:A,t₂:B>:A∧B):A    ⇒    t₁:A

    π₂(<t₁:A,t₂:B>:A∧B):B    ⇒    t₂:B

∨-reduction:

    OE(α,OI₁(t₁:A):A∨B,t₂:C,t₃:C):C    ⇒    t₂[α←t₁]:C

    OE(α,OI₂(t₁:B):A∨B,t₂:C,t₃:C):C    ⇒    t₃[α←t₁]:C

⊃-reduction:

    {(λα.(t₁:B)):A⊃B}(t₂:A):B    ⇒    t₁[α←t₂]:B

∀-reduction:

    {(Λx.(t₁:A)):∀xA}(t₂):A[x←t₂]    ⇒    t₁[x←t₂]:A[x←t₂]

∃-reduction:

    EE(x,α,EI(t₁,t₂:A[x←t₁]):∃xA,t₃:C):C    ⇒    (t₃[x←t₁])[α←t₂]:C

Lemma-reduction:

    {(f:∀xA)(t)}:A[x←t]    ⇒    γ(t):A[x←t]

    condition: f has been assigned the proof procedure γ, and γ(t)≠FAIL.

    {(f:∀xA)(t)}:A[x←t]    ⇒    t′(t):A[x←t]

    condition: t is closed, and f has been assigned the construction t′.

In addition, a reduction rule for the new substitution inference of section 3.3 is needed:

    SB(t₁:(t₃=t₄),t₂:A[x←t₃]):A[x←t₄]    ⇒    t₂:A[x←t₄]

    condition: t₁ may not contain free proof variables.

The effect of SB-reduction is to take the construction t₂ and simply replace the formula
A[x←t₃] attached to its root by A[x←t₄], thus dispensing with the SB inference rule.
Evidently, if t₃ and t₄ are distinct terms, then the result of applying SB-reduction to a
construction will be a labeled p-term which is no longer a construction. However, if the
axioms which appear in t₁ are correct (in some particular model) then t₃ and t₄ will denote the
same object in the model, and in this sense the formulas A[x←t₃] and A[x←t₄] have the same
meaning. In fact, nothing will go wrong if we fail to distinguish between formulas which
differ only by substitution of one term of L by another which denotes the same object. More
formally, relative to any particular model, we may expand the class of constructions to include
all of those labeled p-terms which can be arrived at by substitution of "equals for equals" in
formulas. This mars the uniformity of our treatment in that it introduces model-theoretic
considerations into the definition of the notion of a construction, whereas that notion has been
purely syntactic until now. However, as we have said, none of our results are affected.

The pruning reductions are as follows:

    OE(α,t₁:A∨B,t₂:C,t₃:C):C    ⇒    t₂:C

    Condition: α does not appear free in t₂.

    OE(α,t₁:A∨B,t₂:C,t₃:C):C    ⇒    t₃:C

    Condition: α does not appear free in t₃.


Note that each of the operations described in this section applies to the untyped part of a
construction independently of the attached formulas, in the following sense. Let t:F be a
construction, and let t′:F be the result of applying one of the operations listed above to t:F.
Further, let untyp(r) for any construction r denote the untyped p-term which forms the
"skeleton" of r - that is to say, untyp(r) is arrived at from r by removing the formulas which
label the nodes of r. Then untyp(t′) can be computed from untyp(t) alone. As was
mentioned earlier in the context of the typed λ-calculus, the consequence of this observation
for computational purposes is that the execution and pruning of a construction may be carried
out by treating only its untyped part; the attached formulas need not be carried around in the
course of the computation. In order to clarify the manner in which the various operations
apply to untyped p-terms, we list those operations below with the type information left out.


∧-reduction:

    π₁(<t₁,t₂>)    ⇒    t₁

    π₂(<t₁,t₂>)    ⇒    t₂

∨-reduction:

    OE(α,OI₁(t₁),t₂,t₃)    ⇒    t₂[α←t₁]

    OE(α,OI₂(t₁),t₂,t₃)    ⇒    t₃[α←t₁]

⊃-reduction:

    (λα.t₁)(t₂)    ⇒    t₁[α←t₂]

∀-reduction:

    (Λx.t₁)(t₂)    ⇒    t₁[x←t₂]

∃-reduction:

    EE(x,α,EI(t₁,t₂),t₃)    ⇒    (t₃[x←t₁])[α←t₂]

Lemma-reduction:

    f(t)    ⇒    γ(t)

    condition: t is closed, and f has been assigned the proof procedure γ.

    f(t)    ⇒    t′(t)

    condition: t is closed, and f has been assigned the construction t′.

SB-reduction:

    SB(t₁,t₂)    ⇒    t₂

    condition: t₁ does not contain free proof variables.

Pruning:

    OE(α,t₁,t₂,t₃)    ⇒    t₂

    Condition: α does not appear free in t₂.

    OE(α,t₁,t₂,t₃)    ⇒    t₃

    Condition: α does not appear free in t₃.
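The untyped rules listed above can be sketched as a small rewriting procedure in Python. This is an illustration under stated assumptions rather than a definitive interpreter: p-terms are encoded as tuples with assumed tag names; lemma-reduction is omitted; substitution is the naive one, which is adequate only when no bound variable of the term occurs free in the substituted term; and SB-reduction is applied only when its first argument has no free variables at all, a stronger condition than the one in the text.

```python
def free_vars(t):
    """Variables occurring free in the untyped p-term t."""
    tag = t[0]
    if tag == "var":
        return {t[1]}
    if tag in ("lam", "Lam"):
        return free_vars(t[2]) - {t[1]}
    if tag == "oe":      # OE(a, t1, t2, t3): a is bound in t2 and t3 only
        return free_vars(t[2]) | ((free_vars(t[3]) | free_vars(t[4])) - {t[1]})
    if tag == "ee":      # EE(x, a, t1, t2): x and a are bound in t2 only
        return free_vars(t[3]) | (free_vars(t[4]) - {t[1], t[2]})
    return set().union(*[free_vars(u) for u in t[1:] if isinstance(u, tuple)])

def subst(t, x, s):
    """t[x <- s], ASSUMING no bound variable of t occurs free in s (no renaming is done)."""
    tag = t[0]
    if tag == "var":
        return s if t[1] == x else t
    if tag in ("lam", "Lam") and t[1] == x:
        return t
    if tag == "oe" and t[1] == x:            # x is rebound in t2, t3 but may be free in t1
        return ("oe", x, subst(t[2], x, s), t[3], t[4])
    if tag == "ee" and x in (t[1], t[2]):
        return ("ee", t[1], t[2], subst(t[3], x, s), t[4])
    return (tag,) + tuple(subst(u, x, s) if isinstance(u, tuple) else u for u in t[1:])

def reduce_root(t):
    """Apply one reduction or pruning rule at the root of t; return None if none applies."""
    tag = t[0]
    if tag in ("pi1", "pi2") and t[1][0] == "pair":          # ∧-reduction
        return t[1][1] if tag == "pi1" else t[1][2]
    if tag == "oe":
        _, a, t1, t2, t3 = t
        if t1[0] == "oi1":                                   # ∨-reduction
            return subst(t2, a, t1[1])
        if t1[0] == "oi2":
            return subst(t3, a, t1[1])
        if a not in free_vars(t2):                           # pruning
            return t2
        if a not in free_vars(t3):
            return t3
    if tag == "app" and t[1][0] in ("lam", "Lam"):           # ⊃-reduction and ∀-reduction
        return subst(t[1][2], t[1][1], t[2])
    if tag == "ee" and t[3][0] == "ei":                      # ∃-reduction
        _, x, a, intro, body = t
        return subst(subst(body, x, intro[1]), a, intro[2])
    if tag == "sb" and not free_vars(t[1]):                  # SB-reduction (closed t1 only)
        return t[2]
    return None

def normalize(t):
    """Rewrite t until no rule applies anywhere in it.

    The root is tried before the subterms, so a case analysis that can be pruned
    is removed without its first argument ever being evaluated."""
    while True:
        r = reduce_root(t)
        if r is not None:
            t = r
            continue
        new = (t[0],) + tuple(normalize(u) if isinstance(u, tuple) else u for u in t[1:])
        if new == t:
            return t
        t = new
```

For example, normalizing the application of Λx.EI(x,#) to a numeral yields a term of the form EI(numeral,#), and normalizing OE(α,t₁,#,π₁(α)) yields # by pruning, whatever t₁ is.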


3.6  Results  about  constructions 

As was emphasized in section 2.9 in the context of natural deduction proofs, the results
and conditions which are relevant to the computational use of proof normalization are of
several independent kinds. Specifically, there are results concerning

(1)  the  syntactic  soundness  of  the  normalization  reductions, 

(2)  the  semantic  soundness  of  proofs, 

(3)  special  properties  of  proofs  in  normal  form,  and  finally, 

(4)  the  termination  of  reduction  sequences. 

The conditions upon which the results in one category depend, and the proofs of those
results, are for the most part unrelated to the conditions and proofs which come up in the
other categories. In sections 2.3 and 2.9, results about normalization were stated without proof
as they apply to natural deduction proofs. In this section, we will prove or sketch proofs of
the corresponding results which apply to constructions.

Let R be the set of all the elementary operations on constructions which were described
in the last section. We have:

Proposition  1:  Each  of  the  operations  of  R  yields  a  construction  when  applied  to  a 
construction. 

Proposition  2:  Each  of  the  operations  of  R  yields  a  closed  construction  when  applied  to  a 
closed  construction. 

Proposition 3: Each of the operations of R preserves the end-formula of the construction to
which it is applied.

The  above  propositions  can  be  verified  by  inspection. 

Definition: A construction t:F is a valid construction relative to a model M if the
universal closure of each axiom and each lemma which appears in t is true in M.

Definition:  A  system  of  lemmas  is  valid  relative  to  M  if  each  construction  which  appears 
in  the  system  is  valid  relative  to  M,  and  if  each  proof  procedure  y  which  appears  in  the 
system  returns  only  valid  constructions. 

Proposition 4 (Soundness): Suppose that t:F is a construction with free proof variables
α₁ ... αₙ, and free object variables x₁ ... xₙ. Suppose further that t:F is valid relative to M.
Let A₁ ... Aₙ be the formulas which are attached to the free proof variables α₁ ... αₙ.
Then ∀x₁ x₂ ... xₙ((A₁ ∧ A₂ ... Aₙ) ⊃ F) is true in M.

Proof: Induction on the structure of constructions; each of the rules by which
constructions are built up preserves soundness.

Proposition 5: Suppose that (a) t:F is valid relative to M, and (b) the current system of
lemmas is valid relative to M. Then the result of applying any of the operations of section
3.5 to t:F is a valid construction.

Proof: Observe that, with the exception of the lemma-reduction operation, all operations
in R modify the axioms appearing in constructions only by instantiating free variables which
appear in those axioms. The proposition follows.

Definition: We classify constructors as either "introduction constructors" or "elimination
constructors" according to their correspondence to the introduction rules and elimination rules
of natural deduction. The introduction constructors are PAIR, OI₁, OI₂, λ-ABSTRACTION,
and EI, and the elimination constructors are π₁, π₂, OE, APPLY, EE, and SB. (SB
"eliminates" an equation.)

Theorem 3.1: Let t:F be a closed construction in normal form where F is not a Harrop
formula. Then either t:F is a lemma (i.e. has the form f:F where f is a defined symbol), or the
main constructor of t is an introduction constructor.

Proof: By induction on the structure of proofs. Suppose that t:F is a normal closed
construction where F is not Harrop. Base case: If t is an "atomic" construction consisting of
one node, then it must be either (1) an assumption, (2) a Harrop axiom, or (3) a lemma f:F.
Cases (1) and (2) are impossible: (1) because t is closed, (2) because F is not a Harrop
formula. Thus case (3) must hold, and so the base of the induction is verified. Furthermore
we have verified that any t:F which is not a lemma (and which satisfies the hypotheses of the
theorem) must have a "main" constructor, whether it be an introduction constructor or an
elimination constructor, since t cannot be atomic. For the induction step, we assume that the
proposition holds for each sub-construction of t:F, and then derive a contradiction from the
supposition that the main constructor of t is an elimination constructor. There are 6 cases to
consider. Suppose that the main constructor of t is (1) π₁, (2) π₂, (3) OE, (4) APPLY, (5) EE,
(6) SB. Then t:F has one of the forms (1) π₁(t₁:F∧G):F (2) π₂(t₁:G∧F):F (3)
OE(α,t₁:A∨B,t₂:F,t₃:F):F (4) (t₁:G⊃F)(t₂:G):F or (t₁:∀xA)(t₂):A[x←t₂] (5) EE(x,α,t₁:∃xA,t₂:F):F
(6) SB(t₁,t₂):F. In case (6) t₁ is closed, and therefore SB-reduction can be applied, contrary to
the hypothesis that t is in normal form. In all other cases, t₁ is closed and has a non-Harrop
end-formula, so the induction hypothesis applies. Thus t₁ is either a lemma, or else has an
introduction constructor as its main constructor. If t₁ is a lemma, then t must have the form
t₁(t₂), where t₂ is closed, and thus lemma-reduction could have been applied, contrary to the
hypothesis that t is normal. If t₁ has an introduction constructor as its main constructor, then,
by virtue of the form of the end-formula of t₁, that main constructor must be (1) PAIR, (2)
PAIR, (3) OI₁ or OI₂, (4) λ-ABSTRACTION, (5) EI. But then one of the reduction rules
(1) ∧-reduction, (2) ∧-reduction, (3) ∨-reduction, (4) ⊃-reduction or ∀-reduction, (5)
∃-reduction, can be applied to t, again contrary to the hypothesis that t is in normal form.

Corollary 1: If t:∃xA is closed and normal, then t has the form EI(t₁,t₂:A[x←t₁]):∃xA. If,
in addition, t is valid relative to M, then A(t₁) holds in M.

Corollary 2: If t:A∨B is closed and normal, then t has one of the forms OI₁(t₁:A):A∨B,
or OI₂(t₁:B):A∨B. If, in addition, t is valid relative to M, then, in the first case, A holds in
M, and in the second case B holds in M.

The following corollary of the various results given above establishes the usefulness of
normalization for computational purposes, and the conditions for the partial correctness of a
construction regarded as a computational description.

Corollary 3: Let t₁:∀x∃yφ(x,y) be a closed construction and let t₂ be a closed term of L.
Suppose that some sequence of applications of operations of R to the construction
t₁(t₂):∃yφ(t₂,y) yields a normal construction t'. Then t' has the form EI(t₃,t₄):∃yφ(t₂,y).
Further, if t₁ is valid relative to a model M, and the current system of lemmas is valid relative
to M, then φ(t₂,t₃) is true in M.

Corollary 3 shows that normalization constitutes a satisfactory means for executing a
construction t₁:∀x∃yφ(x,y) in the sense that if one "puts in" a value t₂ for x, and if
normalization terminates, then a value t₃ for y comes out. In addition, the corollary shows
that t₁ regarded as a program is partially correct with respect to the input-output specification
φ, under the condition that all lemmas and axioms in sight are true. Thus the verification of
the partial correctness of an algorithm expressed by a construction is a matter of establishing
the truth of formulas which appear explicitly in the construction and in its system of lemmas.
As a consequence, the passage from a construction to its "verification conditions" is simpler for
constructions than for computational descriptions of a more conventional kind.

We  turn  now  to  the  question  of  termination. 

Definition: A construction t:F has the "termination property" if every sequence of
applications of operations in R to t is finite. That is, there is no infinite sequence of terms
t₁, t₂, . . . such that t₁ = t, and such that tᵢ₊₁ arises from tᵢ by the application of one of the
reductions of R.

Theorem 3.2 (Termination): Suppose that t:F is a recursion-free construction in the
sense that all defined symbols which appear in t are assigned proof procedures and not
constructions. Then t has the termination property.

The standard proof of the termination of normalization for the predicate calculus (see e.g.
Prawitz[1969]) or equivalently for the typed λ-calculus (see Troelstra[1973A]) applies to the
calculus of constructions with only minor technical modifications. Therefore, we omit the
proof of theorem 3.2 here.

Evidently, if recursively defined symbols appear in a construction, the termination
theorem no longer applies. Indeed, there are recursive definitions of a symbol f (such as the
looping definition "f:∀xA ← f:∀xA") which have the property that no finite sequence of
reductions of f(t) where t is closed can lead to a normal form.

Consider the formulation of first order arithmetic which is arrived at by taking the
members of the schema IND_φ as the only recursive constructions. Even here the termination
property fails. The reason for this is that one is free to repeatedly apply lemma-reduction to
IND_φ(t), with t closed, without performing any other reductions, and this process will not
terminate. However, termination can be guaranteed if an additional restriction is made on
lemma-reduction as it applies to the induction schema. Namely, we require that if
lemma-reduction is applied to IND_φ(t) yielding

t' = λx.λα.OE(β,EQD(x,0),
              SB(β,π₁(α)),
              {(π₂(α))(pred x)}(<β,(IND_φ(pred x))(α)>)) (t)

then t' must be brought immediately into one of the two forms

λα.(π₁(α))

or

λα.({(π₂(α))(pred t)}(<#,(IND_φ(pred t))(α)>))

(depending on whether the value of t is zero) before any other reductions are applied. (This
immediate reduction of t' will involve one application of ⊃-reduction, one application of
lemma-reduction to EQD, one application of ∨-reduction, and perhaps one application of
SB-reduction.) When this restriction is made, the effect of lemma-reduction together with the
immediately succeeding reduction steps is very much like that of the induction-reduction rule
in the usual formulation of normalization for first order arithmetic (see Prawitz[1965]). The
restriction results in a system with the strong normalization property - a fact which can be
demonstrated by minor modification of the standard proof of strong normalization for
arithmetic (Troelstra[1973B]). Further, theorem 3.1 continues to apply, since we have
restricted only the order in which reductions may be applied, and have not thereby modified
the notion of a construction in normal form.

Leaving aside the special case of arithmetic, the situation is this. One may take any
algorithm which is expressed by an (ordinary) recursive definition and reformulate it as a
recursive construction; the form of the recursions in the construction will be identical to the
form of the recursions in the original definition. (A concrete example is given in chapter 4.)
If the ordinary recursive definition terminates under some particular order of evaluation (e.g.
call-by-value or call-by-name), then so will the recursive construction under a corresponding
reduction order for normalization. We do not propose to investigate here the general question
of the termination of the normalization of recursive constructions. It is sufficient for the
current purposes to observe that the particular reduction order which we use in the
implementation (namely, the call-by-value order) terminates on the particular proof which
concerns us (namely, the bin-packing proof of chapter 4).

3.7 Another reduction rule


An additional reduction rule beyond those so far mentioned is used in the normalization
of the bin-packing proof of chapter 4 - namely, the permutation rule for the ∨-elimination
inference.

In construction notation, this is:

OE(α,OE(β,t₁:A∨B,t₂:C∨D,t₃:C∨D),t₄:E,t₅:E):E  ⇒

OE(β,t₁:A∨B,OE(α,t₂:C∨D,t₄:E,t₅:E):E,OE(α,t₃:C∨D,t₄:E,t₅:E):E):E

where it is assumed (without loss of generality) that β does not appear
free in either t₄ or t₅.

None of the results concerning constructions given in section 3.6 is affected by the
addition of this rule. This is immediate for all results concerning the properties of
constructions in normal form, since any construction which is in normal form with respect to
the reduction system which includes the new permutation rule is a fortiori in normal form
with respect to the reduction system without this rule. The only result which needs checking
is the termination theorem (theorem 3.2). But, as it happens, standard proofs of this theorem,
such as that given in Prawitz[1969], treat reduction systems in which permutation reductions
are included.


3.8 Effects of pruning on efficiency

To  avoid  misunderstanding:  The  principal  evidence  which  we  will  provide  concerning  the 
utility  of  pruning  in  improving  efficiency  is  the  bin-packing  example  of  chapter  4.  But  to  help 
in  choosing  other  examples  where  pruning  is  likely  to  be  of  use,  it  is  desirable  to  illustrate  the 
features  on  which  the  behaviour  of  pruning  depends  in  a  simple  and  abstract  setting.  With 
this  in  mind,  we  make  the  following  formal  points  by  means  of  schematic  examples. 

(1) Pruning can lead to a very large increase in the efficiency of an algorithm which has
been specialized.

(2) Pruning can lead to a very large decrease in the efficiency of an algorithm.

(3) The inclusion of proofs of Harrop formulas can improve the effectiveness of pruning.

Consider, then, the following proof:

[Proof figure P₁: a natural deduction proof of ∀xy∃zC(x,y,z), with subproofs Π₁ and Π₂, in
which C(x,y,r₃) is derived under the assumptions [B(x)] and [G(y)].]

f = λxy. OE(α,t₁,EI(r₁,#(α)),
         OE(β,t₂,EI(r₂,#(β)),
         EI(r₃,#(<α,β>)))):∃zC(x,y,z)

where t₁,t₂ are the construction notations for Π₁, Π₂.


Now, consider the result of specializing the construction f to a particular value for y; say
y = r₀ where r₀ is a closed term of L. The specialized construction may be written,

λx.{f(x,r₀)} = λx. OE(α,t₁,EI(r₁,#(α)),
                   OE(β,t₂[y←r₀],EI(r₂,#(β)),
                   EI(r₃,#(<α,β>)))):∃zC(x,r₀,z)

Suppose that the normal form of t₂[y←r₀] is OI₁(t₃) where t₃ does not contain α free -
that is to say, suppose that t₂ when normalized returns the decision that F(r₀) and not G(r₀)
holds, and also that the proof of this decision does not use the assumption B(x). Then the
result of normalizing λx.{f(x,r₀)} without pruning is,

(a) λx. OE(α,t₁,EI(r₁,#(α)),EI(r₂,#(t₃))):∃zC(x,r₀,z)

Pruning can be applied to the above expression, yielding

(b) EI(r₂,#(t₃)):∃zC(x,r₀,z)

Now, suppose that t₁ represents an extremely slow algorithm, so that t₁ applied to any
particular argument takes a long time to normalize. Then the passage from (a) to (b)
represents a large increase in efficiency: the normalization (without pruning) of the
construction (a) on an input r requires that t₁[x←r] be normalized, whereas the construction (b)
supplies the output "r₂" for all inputs, and so requires no reduction steps at all.
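
The step from (a) to (b) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the thesis's implementation: it assumes that pruning replaces a case analysis OE(α,t,u,v) by whichever branch does not contain α free, which is how the examples in this section behave. The class names `Var`, `Node`, `OE` and the helper `EI` are hypothetical, λ-binders are left out, and t₁ and t₃ are treated as opaque constants.

```python
from dataclasses import dataclass
from typing import Set, Tuple, Union


@dataclass(frozen=True)
class Var:
    """An assumption variable such as alpha or beta."""
    name: str


@dataclass(frozen=True)
class Node:
    """Any constructor without binders, e.g. EI(r, p), #(p), or a constant."""
    op: str
    args: Tuple["Term", ...] = ()


@dataclass(frozen=True)
class OE:
    """OE(var, scrutinee, left, right): `var` is bound in both branches."""
    var: str
    scrutinee: "Term"
    left: "Term"
    right: "Term"


Term = Union[Var, Node, OE]


def free_vars(t: Term) -> Set[str]:
    """Assumption variables occurring free in t."""
    if isinstance(t, Var):
        return {t.name}
    if isinstance(t, Node):
        out: Set[str] = set()
        for a in t.args:
            out |= free_vars(a)
        return out
    branches = free_vars(t.left) | free_vars(t.right)
    return free_vars(t.scrutinee) | (branches - {t.var})


def prune(t: Term) -> Term:
    """Replace a case analysis by a branch that ignores its assumption."""
    if isinstance(t, Var):
        return t
    if isinstance(t, Node):
        return Node(t.op, tuple(prune(a) for a in t.args))
    left, right = prune(t.left), prune(t.right)
    if t.var not in free_vars(left):
        return left
    if t.var not in free_vars(right):
        return right
    return OE(t.var, prune(t.scrutinee), left, right)


def EI(r: str, proof: Term) -> Term:
    """Build EI(r, #(proof)) with r as an opaque constant."""
    return Node("EI", (Node(r), Node("#", (proof,))))


# Construction (a), without the outer lambda; t1 and t3 are opaque constants.
a = OE("alpha", Node("t1"), EI("r1", Var("alpha")), EI("r2", Node("t3")))
b = prune(a)   # the case analysis on t1 disappears, as in (b)
```

Because the second branch of (a) does not mention α, `prune` never has to look at the scrutinee t₁, which is why the cost of t₁ vanishes from the pruned construction.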

How slow can t₁ be? The answer, for all practical purposes, is: arbitrarily slow, since
recursive constructions can "run" as slowly as any recursive function. Even if t₁ is a
recursion-free construction, normalization can still take so long as to be completely infeasible.
In particular, there is no elementary recursive function in n which bounds the number of
reduction steps required to normalize non-recursive constructions of size n [Statman 1977].
(In other words, there is no such bound of the form 2^n, or of the form 2^(2^n), or of the form
2^(2^(2^n)), and so on.)

So, we have demonstrated point (1) above. Point (2) can be demonstrated using a similar
schematic example. Consider the following proof.

[Proof figure P₂: as P₁, except that C(x,y,r₂) is derived by ⊃E from [F(y)] and
F(y)⊃C(x,y,r₂), and C(x,y,r₃) is derived by ⊃E from [G(y)] and G(y)⊃C(x,y,r₃). Each is
followed by ∃I, and the case analysis on F(y)∨G(y) (proved by Π₂) yields ∃zC(x,y,z). The
end-formula is ∀xy∃zC(x,y,z).]

The above proof differs from the first proof P₁ only in that "C(x,y,r₃)" no longer depends
upon "B(x)". The construction notation for P₂ is:

g = λxy. OE(α,t₁,EI(r₁,#(α)),
         OE(β,t₂,EI(r₂,#(β)),
         EI(r₃,#(β)))):∃zC(x,y,z)

If pruning is applied to g, one gets

λxy. OE(β,t₂,EI(r₂,#(β)),
     EI(r₃,#(β))):∃zC(x,y,z)

Now, suppose in this case that t₁ is a fast algorithm, that is, that t₁[x←r] can be
normalized in just a few steps for each input r. Suppose further that t₂ is very slow. Then we
have the following situation: whenever A(x) holds, r₁ may be immediately returned as the
output, but when B(x) holds a long computation must be undertaken to determine which of
F(y) and G(y) holds. However, the correctness of the "long computation" does not depend on
whether B(x) holds. Thus we have a fast way (t₁) of discriminating between two ways of
computing a satisfactory output, one of which is very fast (the simple return of r₁), and the
other of which is very slow ("λxy. OE(β,t₂,EI(r₂,#(β)),EI(r₃,#(β))):∃zC(x,y,z)"). Further,
the slow way always works. Pruning has the effect of throwing away the discrimination (t₁)
and choosing the slow way every time. Evidently, if A(x) holds for many values of x, then
pruning degrades the average efficiency of the algorithm. In the extreme case where A(x)
holds for all x, pruning takes a very fast algorithm and replaces it by a very slow one.
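
The trade-off can be made concrete with a rough cost model. This is an illustrative sketch rather than part of the original argument; the symbols c₁, c₂ and p are assumptions introduced here. They denote the cost of normalizing t₁, the cost of normalizing t₂, and the fraction of inputs x for which A(x) holds.

```latex
\begin{align*}
\mathrm{cost}(g) &\approx c_1 + (1-p)\,c_2, \\
\mathrm{cost}(\text{pruned } g) &\approx c_2, \\
\mathrm{cost}(g) < \mathrm{cost}(\text{pruned } g) &\iff c_1 < p\,c_2 .
\end{align*}
```

Under this model pruning is harmful exactly when the fast test is cheaper than the share of slow computations it avoids. In the extreme case p = 1 the comparison is simply c₁ against c₂.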

Point (2) has now been demonstrated, and we turn to point (3). As we have seen, all
proofs of Harrop formulas may be omitted without interfering with the possibility of
"running" a proof or construction. However, we will show here that the inclusion of a proof
of a Harrop formula can extend the possibilities for pruning. As a consequence, the inclusion
of proofs of Harrop formulas can in some cases improve the effectiveness of pruning in
optimizing algorithms. We consider a third minor variant of the original schematic proof P₁.

In this case the change from P₁ is that C(x,y,r₂) now appears to depend on both B(x) and
F(y). The construction notation for this proof is,

h = λxy. OE(α,t₁,EI(r₁,#(α)),
         OE(β,t₂,EI(r₂,#(<α,β>)),
         EI(r₃,#(<α,β>)))):∃zC(x,y,z)



We assume for the current discussion that C is a Harrop formula. Thus
"(B(x)∧F(y))⊃C(x,y,r₂)" is a Harrop formula, and could have been given simply as an axiom.
Suppose however that the proof Π₃ of "(B(x)∧F(y))⊃C(x,y,r₂)" has the form:

[Proof figure Π₃: a case analysis on H(y)∨I(y). In the case H(y), F(y)∧H(y) is formed by ∧I
from [F(y)] and [H(y)], and C(x,y,r₂) follows from it and (F(y)∧H(y))⊃C(x,y,r₂). In the case
I(y), the derivation of C(x,y,r₂) also uses B(x).]

Thus C(x,y,r₂) may or may not actually depend on B(x): if H(y) holds it doesn't, and if I(y)
holds it does. The construction notation for the above proof is,

OE(γ,t₃,#(<β,γ>),#(<α,<β,γ>>)):C(x,y,r₂)

So, h has the form:

h = λxy. OE(α,t₁,EI(r₁,#(α)),
         OE(β,t₂,EI(r₂,OE(γ,t₃,#(<β,γ>),#(<α,<β,γ>>))),
         EI(r₃,#(<α,β>)))):∃zC(x,y,z)

Suppose that h is specialized to λx.h(x,r₀), where H(r₀) and F(r₀) hold (according to
normalization of t₂ and t₃, which yield OI₁(t₄) and OI₁(t₅) respectively; we assume that
neither t₄ nor t₅ contains α free). Then if λx.h(x,r₀) is normalized without pruning the
following construction results,

λx. OE(α,t₁,EI(r₁,#(α)),
       EI(r₂,#(<t₄,t₅>)))

Finally, pruning yields,

EI(r₂,#(<t₄,t₅>))

Evidently, if the proof Π₃ for C(x,y,r₂) had not been given, there would have been no
possibility of applying this last pruning operation. By the same argument given above for
point (1), this pruning can lead to a large increase of efficiency.




Thus,  although  proofs  of  Harrop  formulas  are  not  required  for  the  execution  of  a  proof, 
they  can  be  used  to  improve  the  analysis  of  dependencies  upon  which  pruning  relies. 
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Chapter  4 


Specialization  of  a  Bin-Packing  Algorithm 

The experiments described in this chapter demonstrate that prunable redundancies occur
in the "real" computational world. The experiments concern the specialization of the first-fit
backtracking algorithm for one-dimensional bin-packing. This algorithm takes a list L₁ of
block sizes and a list L₂ of bin sizes as input. Each block and bin is "one-dimensional" in
the sense that its size is given by a single positive number. The algorithm performs a
depth-first search for a packing of the blocks into the bins - that is, for an assignment of the
blocks to the bins with the property that the sum of the sizes of the blocks assigned to any
given bin does not exceed the size of that bin. If such an assignment is found, the algorithm
returns that assignment as its result, and otherwise it returns an indication that no packing
exists. The algorithm is referred to as a "first fit" algorithm because, in the course of search,
its initial try is to place a block in the first bin in which it fits. The bin-packing problem is
well known to be NP-complete [Garey and Johnson, 1979], and this particular algorithm has a
worst case running time which is exponential in the size of the input. However, the problem
is tractable for small inputs. It is of interest to see how much the algorithm can be sped up in
the cases where the inputs are of feasible size.
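
For orientation, the algorithm just described can be sketched as an ordinary recursive program. This is a minimal illustration, not the proof-derived algorithm used in the experiments. The function name `first_fit_pack` is hypothetical. The sketch assumes that a block fits when its size is at most the remaining capacity of the bin (the "less than or equal" condition of section 4.2), and it represents an assignment as a list giving, for each block, the 1-based number of its bin.

```python
from typing import List, Optional


def first_fit_pack(blocks: List[int], bins: List[int]) -> Optional[List[int]]:
    """Depth-first search for a packing; returns None if no packing exists."""
    remaining = list(bins)       # free capacity left in each bin
    assignment: List[int] = []   # assignment[m] = 1-based bin number of block m

    def search(i: int) -> bool:
        if i == len(blocks):
            return True          # every block has been placed
        for b in range(len(remaining)):
            # First fit: try the bins in order, starting with the first
            # one in which the block fits.
            if blocks[i] <= remaining[b]:
                remaining[b] -= blocks[i]
                assignment.append(b + 1)
                if search(i + 1):
                    return True
                # Backtrack: undo the placement and try the next bin.
                assignment.pop()
                remaining[b] += blocks[i]
        return False

    return assignment if search(0) else None


packing = first_fit_pack([3, 2, 2], [4, 3])
no_packing = first_fit_pack([3, 3], [4])
```

In the first call the initial try (block 1 in bin 1) fails further down the search, so the algorithm backtracks and places block 1 in bin 2. Each comparison `blocks[i] <= remaining[b]` corresponds to one case analysis in the decision trees discussed in this chapter.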

The bin-packing algorithm was formalized as a natural deduction proof in the first order
theory of lists and numbers, and an untyped p-calculus term was extracted from this proof.
The proof was constructed "by hand", but the extraction of the p-term from the proof, and all
other phases of the experiments, were carried out automatically by a system of proof
manipulation programs running on the Stanford Artificial Intelligence Laboratory PDP-10
computer. Several experiments were carried out, each of which involved specializing the
algorithm to handle problems of a particular size and structure. For example, a specialized
algorithm for packing six blocks given in order of descending size into three bins of equal size
was derived from the general bin-packing algorithm by the following steps. (1) The
p-calculus term which describes the general algorithm was executed (normalized without
pruning) on the symbolic inputs L₁ = <i₁,i₂,i₃,i₄,i₅,i₆>, L₂ = <n,n,n>, where the iⱼ and n
are numeric variables, and where it was assumed further that i₁≥i₂≥ . . . ≥i₆. The resulting
p-calculus term had the form of a decision tree. (2) The decision tree was subjected to an
optimization involving the elimination of case analyses whose outcome was decided by
formulas already assumed on the branch so far taken in the tree. The optimization was
carried out by use of the simplex algorithm (all the case analysis predicates in bin-packing
have the form of inequalities between sums). The process so far could as easily have been
carried out on an ordinary program as on a proof or p-calculus term. However, at stage (3)
pruning was applied. The question of central interest was this: what increase in speed and
reduction in size would be obtained by the application of pruning?

In practice, it was not feasible to carry out steps (1) and (2) separately, since the decision tree
resulting from step (1) would have been extremely large. Instead, normalization and
optimization were applied "in parallel" - the decision tree was optimized in the course of its
construction.

Experiments of the kind just described were carried out for all combinations of numbers
n₁ of blocks and numbers n₂ of bins with 2≤n₂≤n₁≤6. In all cases, pruning turned out to be
a useful optimization. As an example, we consider again the case where n₁ = 6 and n₂ = 3.
The decision tree which results from steps (1) and (2) has 87 decision nodes and a depth of
14. When pruning is applied, the tree shrinks to 15 decision nodes with a depth of 8. Thus
more than 4/5 of the decision nodes in the decision tree resulting from steps (1) and (2) are
redundant in the sense recognized by pruning. If one measures the running time of a
bin-packing algorithm by the number of comparisons which it makes, then the worst case
running time of the original algorithm on inputs of the special form currently under
consideration is 174. The worst case running time of a decision tree algorithm according to
this measure is simply the depth of the tree. Thus the simplex optimization and pruning taken
together produce a factor of improvement of nearly 22 in worst case running time (from 174
to 8).
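
The two measures used here, the number of decision nodes and the depth, can be stated precisely with a small sketch. The `Leaf` and `Decision` classes are hypothetical stand-ins for the decision trees produced by the system; only the figures 87, 15, 174 and 8 come from the text.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Leaf:
    result: object           # e.g. an assignment, or None for "no packing"


@dataclass
class Decision:
    test: str                # a comparison such as "i1 + i2 <= n"
    if_true: "Tree"
    if_false: "Tree"


Tree = Union[Leaf, Decision]


def decision_nodes(t: Tree) -> int:
    """Size measure: the number of comparisons appearing in the tree."""
    if isinstance(t, Leaf):
        return 0
    return 1 + decision_nodes(t.if_true) + decision_nodes(t.if_false)


def depth(t: Tree) -> int:
    """Worst-case running time: the most comparisons made on any input."""
    if isinstance(t, Leaf):
        return 0
    return 1 + max(depth(t.if_true), depth(t.if_false))


# Figures reported in the text for n1 = 6, n2 = 3.
redundant_fraction = 1 - 15 / 87   # share of decision nodes removed by pruning
speedup = 174 / 8                  # original algorithm vs. pruned tree
```

With these figures `redundant_fraction` is about 0.83, which is the "more than 4/5" of the text, and `speedup` is 21.75, the factor of "nearly 22".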

As mentioned in section 2.8, pruning may have the effect of changing the function
computed by a proof. Pruning does in fact have this effect in each of the experiments
described in this chapter. Furthermore, this effect is essential to the success of pruning in
improving efficiency. For 2≤n₂≤n₁≤4, the algorithm produced by pruning (in combination
with symbolic execution and the simplex optimization) is both smaller and faster than any
decision tree algorithm which computes the same function as the original algorithm. (This
may be true for n₁ = 5 and n₁ = 6 as well, although this has not been checked.) Thus, no
collection of conventional optimizations could have produced specialized algorithms for
bin-packing which are as efficient as those produced by pruning, since conventional
optimizations preserve the extensional meaning of the programs to which they are applied.

The following conclusions can be drawn from the experiments. (1) The simplex
optimization with or without pruning yields a large speed-up of the algorithm. (2) Pruning
dramatically decreases the size of the specialized decision tree algorithm, and produces a
moderate improvement in its speed (i.e. depth). (3) The improvements produced by pruning
could not have been produced by conventional optimizations. In the largest experiments
(where n₁ = 6 and n₂≥4), it was not feasible to produce a decision tree algorithm at all
without the use of pruning; pruning had to be run in parallel with the simplex optimization
and normalization in order to avoid running out of memory space. Thus in this application,
the main practical effect of pruning was to make possible the production of fast specialized
algorithms which are of a reasonable size. In devising combinatorial algorithms for handling a
finite number of cases, speed is not the only problem, since one can often make use of table
look-up to get a very fast but very large algorithm. What is difficult is to produce an
algorithm which is both small and fast.

In what follows, we describe the experiments in detail. Section 4.1 concerns the system of
programs used for doing the experiments. In section 4.2, we describe the proof which
implements the bin-packing algorithm. Section 4.3 concerns the reductions on object terms
used in normalization. Section 4.4 gives the results of the experiments. Conclusions based on
the results are given briefly in section 4.5.


4.1  The  implementation 

The system of programs used for the experiments was written in MacLISP, and runs on
the Stanford Artificial Intelligence Laboratory PDP-10 computer. The system consists of three
components: (1) a proof checker for natural deduction, (2) a mechanism for extracting
untyped p-calculus terms from proofs, and (3) a normalizer (with pruning) for the p-calculus.
The proof checker is interactive, and allows the user to specify the first-order language in
which a proof is to be given. In these respects, it resembles the FOL proof checker
[Weyhrauch 1974].

The normalizer, both in internal design and in function, is very much like interpreters for
λ-calculus based languages such as LISP [McCarthy et al. 1962] and SCHEME [Sussman and
Steele, 1975]. The execution of a LISP or SCHEME program is essentially a matter of
normalizing a closed λ-calculus term which ends up with an object term as its normal form.
In the case of SCHEME, where the static binding convention is observed, the interpreter has
exactly the effect of a λ-calculus normalizer when applied to a closed term having a "concrete
value", whereas in most standard dialects of LISP (e.g. LISP 1.6, MacLISP, InterLISP),
dynamic binding holds sway, leading to a somewhat different behavior than normalization.
In any case, there exists a well developed technology for efficient normalization of some kinds
of λ-terms, and this technology is easily adapted to the task of normalization in the p-calculus.

A central element of this technology is the use of environments for implementing
substitutions. The idea here is this. An environment is an association
{(x₁,t₁),(x₂,t₂), . . . (xₙ,tₙ)} of terms with variable names. If one wishes to evaluate (or
normalize) a term which is given as the result t₁[x←t₂] of a substitution, then, instead of
doing the substitution first and the normalization afterwards, one normalizes the term t₁ in the
environment {(x,t₂)}. The normalization of a term t in an environment e is like normalization
of the usual kind, except that variables which have been assigned values in the environment
are regarded as temporary names for those values. Most reduction rules applied in the course
of normalizing t do not make use of the internal structure of the subterm which a temporary
name designates; on occasions when this internal structure is relevant, the value assigned to
the name is looked up. We won't go into further detail about how environments are used;
the reader who is unfamiliar with these techniques should see [McCarthy et al, 1962].

Our normalizer uses environments in the implementation of ∀-reduction, ⊃-reduction,
∨-reduction, and ∃-reduction. The normalizer resembles traditional interpreters in the
additional respect that a "call-by-value" reduction order is used. That is to say, except for
terms whose main constructor is APPLY, OE or EE, a term t with immediate subterms
t₁ . . . tₙ is normalized by first normalizing each tᵢ, and then applying reductions to the
result. In the case of (1) APPLY(t₁,t₂, . . . tₙ), (2) OE(α,t₁,t₂,t₃), and (3) EE(x,α,t₁,t₂), t₁ is
normalized first. If t₁ has the form (1) λv.t, (2) OI₁(t) or OI₂(t), (3) EI(t,t'), then (1) t, (2) t₂
or t₃, (3) t₂ is normalized in the extension of the current environment which associates (1)
t₂ . . . tₙ with v₁ . . . vₙ, (2) t with α, (3) x with t and α with t'. If t₁ does not have the
appropriate form to allow a reduction rule to be applied, then t₂ . . . tₙ are normalized in
sequence.
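
The reduction order just described can be sketched as a small recursive normalizer over tagged tuples. This is an illustration under several assumptions, not the MacLISP program. The tuple encoding and the name `normalize` are hypothetical, all bound names are assumed distinct, arguments are normalized before being bound (one reading of "call-by-value"), λ-terms are kept as closures over their defining environment in the SCHEME style, and no reduction is performed under a λ.

```python
from typing import Dict, Tuple

# Hypothetical term encoding:
#   ("var", x)  ("const", c)  ("lam", params, body)  ("apply", f, args)
#   ("oi", k, t)  ("oe", alpha, t1, t2, t3)  ("ei", t, p)  ("ee", x, alpha, t1, t2)
Term = Tuple
Env = Dict[str, Term]


def normalize(t: Term, env: Env) -> Term:
    tag = t[0]
    if tag == "var":
        # A bound variable is a temporary name for its value.
        return env.get(t[1], t)
    if tag in ("const", "closure"):
        return t
    if tag == "lam":
        return ("closure", t[1], t[2], env)
    if tag == "apply":
        f = normalize(t[1], env)                      # t1 first
        args = tuple(normalize(a, env) for a in t[2])
        if f[0] == "closure":
            _, params, body, saved = f
            return normalize(body, {**saved, **dict(zip(params, args))})
        return ("apply", f, args)                     # no rule applies
    if tag == "oe":
        _, alpha, t1, t2, t3 = t
        d = normalize(t1, env)
        if d[0] == "oi":                              # OI1(t) or OI2(t)
            branch = t2 if d[1] == 1 else t3
            return normalize(branch, {**env, alpha: d[2]})
        # Undecided case analysis: normalize both branches in sequence.
        return ("oe", alpha, d, normalize(t2, env), normalize(t3, env))
    if tag == "ee":
        _, x, alpha, t1, t2 = t
        d = normalize(t1, env)
        if d[0] == "ei":                              # EI(t, t')
            return normalize(t2, {**env, x: d[1], alpha: d[2]})
        return ("ee", x, alpha, d, normalize(t2, env))
    # Every other constructor: normalize the immediate subterms first.
    return (tag,) + tuple(
        normalize(s, env) if isinstance(s, tuple) else s for s in t[1:]
    )


program = ("lam", ("d",),
           ("oe", "a", ("var", "d"),
            ("const", "left"),
            ("ei", ("const", 7), ("var", "a"))))
result = normalize(("apply", program, (("oi", 2, ("const", "p")),)), {})
```

An undecided case analysis is what turns symbolic execution into a decision tree: when t₁ does not normalize to OI₁ or OI₂, the `oe` node survives with both branches normalized.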

The normalizer is an iterative program in the style of the SCHEME interpreter [Sussman
& Steele 1975]. A collection of (software) switches controls the mode in which the normalizer
operates. For example, the pruning reductions and the permutation operations can be turned
on and off at will. Proof procedures (section 2.5) are implemented by calls from the
normalizer to ordinary LISP functions. The entire system, including the proof checker, the
extractor, the normalizer, and a top level, constitutes about 900 lines of MacLISP code, and
when compiled occupies 70,000 36-bit words of memory. The former figure includes
only the code which was written by the current author specifically for the proof manipulation
system. It does not include the code contained in the two "packages" which were imported
into the system, namely a general purpose pretty-printer written by Derek Oppen (see [Oppen,
1979]) and a simplex algorithm written by Greg Nelson. The figure of 70,000 words, however,
measures the total amount by which the size of the proof manipulation system exceeds that of
"bare" MacLISP.

4.2 The proof


The bin-packing proof used in the experiments is formulated in a first order language L₀
for numbers and lists of numbers. There are two sorts of variables: variables which range
over non-negative integers, and variables which range over lists of non-negative integers.
(Note that the use of sorted variables where the sorts are disjoint has no effect whatever on
the treatment of proofs as computational descriptions; in normalization and the extraction of
p-terms, the sort information may be simply ignored.) In what follows, lower case letters are
used for numeric variables, while capital letters are used for variables which range over lists.

The function and relation symbols of L₀ are listed below, together with their intended
meanings. Note that some of the symbols are given as infix operators. Any language
definition supplied to the proof checker includes information as to which binary function and
relation symbols are to be treated as infix operators by the parser for formulas and terms.
Our usage below directly reflects this syntactic part of the formal definition of L₀.

symbol    intended meaning

+         n+m is the sum of n and m.
-         n-m is the result of subtracting m from n.
<         n<m holds if n is less than m.
≤         n≤m holds if n is less than or equal to m.
lnth      lnth(A) is the length of the list A.
@         n @ A is the list which results from adding n to the front of A.
:         A:n is the nth element of A (it makes no difference for our purposes
          how A:n is defined for n=0 or n > lnth(A)).
tl        tl(A) (read "tail of A") is the result of removing the first element
          from the list A; the tail of the empty list is the empty list.
set       set(A,n,m) is the list which results from replacing the nth element of
          A by m. If n = 0 or n > lnth(A) then set(A,n,m) is A.
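
The intended meanings of the list symbols can be mirrored directly in Python. This is an informal sketch for reading the formulas that follow, not the proof procedures used by the system. The function names `cons`, `nth` and `set_nth` are hypothetical stand-ins for the infix symbols @ and : and for set, and the value returned by `nth` outside the range 1..lnth(A) is an arbitrary choice, which the text leaves open.

```python
from typing import List


def lnth(A: List[int]) -> int:
    """lnth(A): the length of the list A."""
    return len(A)


def cons(n: int, A: List[int]) -> List[int]:
    """n @ A: the list obtained by adding n to the front of A."""
    return [n] + A


def nth(A: List[int], n: int) -> int:
    """A:n, the nth element of A, counting from 1."""
    if 1 <= n <= len(A):
        return A[n - 1]
    return 0  # arbitrary: the text leaves A:n unspecified outside 1..lnth(A)


def tl(A: List[int]) -> List[int]:
    """tl(A): A without its first element; the tail of [] is []."""
    return A[1:]


def set_nth(A: List[int], n: int, m: int) -> List[int]:
    """set(A,n,m): A with its nth element replaced by m, or A if n is out of range."""
    if n == 0 or n > len(A):
        return list(A)
    return A[:n - 1] + [m] + A[n:]
```

All positions are 1-based, matching the convention that bins and blocks are numbered from 1.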

It is most convenient to think of L0 as having just one list constant, namely "<>" for the
empty list, and infinitely many numeric constant symbols: one for each number. The numeric
constants (numerals) are represented in a direct fashion in the implementation, namely by
LISP numbers. The parser and the programs which print out proofs and p-terms use
ordinary decimal notation for numerals.

We will abbreviate a term having the form "t1 @ t2 @ t3 @ ... tn @ <>" by
"<t1,t2, ... tn>". Thus ti in <t1,t2, ... tn> denotes the ith element of the list denoted by
<t1,t2, ... tn>. The parser and the output programs also use this notation (in fact, the same
kind of abbreviation is used in the internal representation of terms). Also, "null(X)" will
serve as an abbreviation for "X = <>".

We can state the one-dimensional bin-packing problem in the following way. Suppose
that we have p blocks and q bins. Each block and each bin has a particular size given by a
positive integer. Let X = <i1, ... ip> be a list of the sizes of the blocks, and let B = <j1,
... jq> be a list of the sizes of the bins. An assignment of the blocks X to the bins B will be
represented by a list <k1, ... kp>, where km is the number of the bin to which the mth
block is assigned. For example, <2,1,1> represents an assignment of three blocks to two bins,
where the first block is assigned to the second bin, and the remaining two blocks are assigned
to the first bin.

Now, an assignment A = <k1, ... kp> of blocks X to bins B is legal if each block of X
is assigned to some bin of B (ie if lnth(A) = lnth(X), and km ≤ lnth(B) for each m), and if the
sum of the sizes of the blocks assigned to any one bin is less than or equal to the size of that
bin. The one-dimensional bin-packing problem is this: given lists of block sizes X and bin
sizes B, determine whether there is a legal assignment A of the blocks to the bins, and if there
is, give it.
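The definition of a legal assignment can be sketched as a small Python check. This is an illustration only, not the definition of "legal" used in the proofs: the name `is_legal` is hypothetical, and the lower bound 1 ≤ km is an added assumption reflecting the fact that bins are numbered from 1.

```python
def is_legal(A, X, B):
    """Check whether A assigns the blocks X to the bins B legally.

    A[m-1] is the (1-based) number of the bin which receives the mth block.
    """
    if len(A) != len(X):
        return False
    load = [0] * len(B)          # total size placed in each bin so far
    for size, k in zip(X, A):
        if not 1 <= k <= len(B):  # assumption: bin numbers run from 1 to lnth(B)
            return False
        load[k - 1] += size
    return all(used <= capacity for used, capacity in zip(load, B))
```

For instance, with blocks of sizes 5, 2 and 3 and two bins of size 5, the assignment <2,1,1> from the example above passes this check, while <1,1,1> does not.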

The algorithm for bin-packing which is used in the experiments is as follows, expressed as
an ordinary definition by mutual recursion.

pack(X,B) ← if null(X) then <> else packb(X,B,1)

packb(X,B,n) ← if n≤lnth(B) then
                   if X:1≤B:n then
                       if pack(tl(X),set(B,n,B:n - X:1))≠FAIL then
                           (n @ pack(tl(X),set(B,n,B:n - X:1)))
                       else packb(X,B,n+1)
                   else packb(X,B,n+1)
               else FAIL


An informal explanation of the workings of this algorithm is as follows.

The function pack takes a list of blocks X and bins B and returns either a legal
assignment of the blocks to the bins, or "FAIL", meaning that there is no packing. Pack first
checks whether there are any blocks (ie whether X is null). If not, then the null assignment
will do. Otherwise, the function packb is called with the bound n set to 1. In packb(X,B,n) it
may be assumed that X is non-empty.

The "bounded" packing function packb attempts to find a packing of the blocks X into
the bins B subject to a restriction on where the first block in X may be put; namely, the first
block must be assigned to a bin whose index is n or greater. Packb first checks whether n is
greater than the length of B; if this is the case then no packing which satisfies the given
restriction is possible. Otherwise, packb checks whether the first block fits in the nth bin (ie
whether X:1≤B:n). If the block fits, then an attempt is made to pack the rest of the blocks
into the space which remains in the bins; specifically pack(tl(X),B') is called, where
B' = set(B,n,B:n - X:1). B' differs from B in that the size of the nth bin has been reduced
to reflect the assignment of the first block to that bin. If such a packing A of tl(X) into B' is
found, then n @ A evidently suffices as a packing of X into B. Finally, if no packing of tl(X)
into B' is possible, or if X:1 did not fit into B:n in the first place, then packb(X,B,n+1) is
called. Thus, the end effect of executing packb(X,B,1) is that the first block X:1 is placed
successively in the first bin in which it fits, the second bin in which it fits, and so on, until a
placement of X:1 is found which can be extended to a complete packing of X into B or until
no bins are left.
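The backtracking search just described can be transcribed into Python roughly as follows. This is a sketch under stated assumptions rather than the implementation used in the experiments: lists are Python lists, "FAIL" is represented by `None` (so that it cannot be confused with the empty assignment), and the result of the recursive call to pack is bound to a local variable instead of being computed twice.

```python
FAIL = None  # assumption: FAIL is modelled by None, distinct from the empty list

def pack(X, B):
    """Return a legal assignment of the blocks X to the bins B, or FAIL."""
    if not X:
        return []                 # the null assignment will do
    return packb(X, B, 1)

def packb(X, B, n):
    """Like pack, but the first block must go into a bin with index n or greater."""
    if n > len(B):
        return FAIL               # no bins are left
    if X[0] <= B[n - 1]:          # does the first block fit in the nth bin?
        # B with the size of the nth bin reduced by the size of the first block
        reduced = B[:n - 1] + [B[n - 1] - X[0]] + B[n:]
        A = pack(X[1:], reduced)
        if A is not FAIL:
            return [n] + A
    return packb(X, B, n + 1)     # try the next bin
```

With blocks of sizes 3, 2, 2 and bins of sizes 4 and 3, the search first tries the large block in bin 1, fails to place the remaining blocks, backtracks, and ends with the assignment <2,1,1>.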

Note that there are two identical calls to pack in the body of packb. This duplication of
effort could easily have been eliminated by the use of a λ-abstraction, but this was not done
for the sake of simplicity of presentation. The duplication does not appear in the bin-packing
proof.

The bin-packing proof has two parts: a "main theorem" PACK, and a lemma PACKB.
Formally, "PACK" and "PACKB" are to be regarded as defined symbols of the language L0,
to which "recursive" proofs have been assigned according to the rules given in section 3.4.
These proofs correspond closely in structure to the recursive definitions pack and packb; the
proofs embody the same analysis of cases, and the same pattern of recursive calls - in short,
the same algorithm - as do pack and packb. Listings of PACK and PACKB are given below
in the form in which they were printed out by the proof checker. The notation used by the
proof checker is somewhat unusual and will be explained shortly. But first, here is the listing
of PACK.


 1  NULLD(X)                       null(X)∨(¬null(X))
 2  AS(null(X))                    null(X)                                   2
 3  AX(∀B(legal(<>,<>,B)))         ∀B(legal(<>,<>,B))
 4  EV(*2)                         X=<>                                      2
 5  SB(*4,*3(B),2)                 legal(<>,X,B)                             2
 6  EI(<>,*5,∃A(legal(A,X,B)))     ∃A(legal(A,X,B))                          2
 7  AS(¬null(X))                   ¬null(X)                                  7
 8  PACKB(X,B,1)(*7)               ∃A(BLA(A,X,B,1))∨(¬∃A(BLA(A,X,B,1)))      7
 9  AS(∃A(BLA(A,X,B,1)))           ∃A(BLA(A,X,B,1))                          9
10  EV(*9)                         ∃A(legal(A,X,B)∧(¬null(X))∧(1≤(A:1)))     9
11  AS(legal(A,X,B)∧(¬null(X))∧(1≤(A:1)))
                                   legal(A,X,B)∧(¬null(X))∧(1≤(A:1))         11
12  [*11↓1]                        legal(A,X,B)                              11
13  EI(A,*12,∃A(legal(A,X,B)))     ∃A(legal(A,X,B))                          11
14  EE(*10,*13,A)                  ∃A(legal(A,X,B))                          9
15  OI(*14,¬∃A(legal(A,X,B)))      ∃A(legal(A,X,B))∨(¬∃A(legal(A,X,B)))      9
16  AS(¬∃A(BLA(A,X,B,1)))          ¬∃A(BLA(A,X,B,1))                         16
17  AX(∀X B([¬null(X),¬∃A(BLA(A,X,B,1))⊃¬∃A(legal(A,X,B))]))(X,B)(*7,*16)
                                   ¬∃A(legal(A,X,B))                         7,16
18  OI(∃A(legal(A,X,B)),*17)       ∃A(legal(A,X,B))∨(¬∃A(legal(A,X,B)))      7,16
19  OE(*8,*15,*18)                 ∃A(legal(A,X,B))∨(¬∃A(legal(A,X,B)))      7
20  OI(*6,¬∃A(legal(A,X,B)))       ∃A(legal(A,X,B))∨(¬∃A(legal(A,X,B)))      2
21  OE(*1,*20,*19)                 ∃A(legal(A,X,B))∨(¬∃A(legal(A,X,B)))
22  λX B(*21)                      ∀X B(∃A(legal(A,X,B))∨(¬∃A(legal(A,X,B))))


First we comment on two predicate symbols which appear in the above listing. The
formula legal(A,X,B) holds if A is a legal assignment of the blocks X to the bins B. With a
bit of work, "legal" can be defined from the primitive operators and predicates of L0 which
were given above, but there is no reason to do so here. For the current purposes, "legal" is
treated as primitive. The formula BLA(A,X,B,n) holds if A is a "legal bounded assignment"
of the kind that packb might generate - that is to say a legal assignment of a non-empty list X
of blocks to bins B which assigns the first block to a bin whose index is at least n. BLA is
used as a defined predicate; its definition is

BLA(A,X,B,n) ≡ legal(A,X,B)∧(¬null(X))∧(n≤(A:1))

The above listing should be regarded as a "linear" notation for a natural deduction proof
tree of the kind discussed in chapter 2. Each line of the listing designates a node of the
tree. A line has four pieces of information associated with it. Reading from left to right,
these are (1) the line number, (2) a term, (3) a formula, and (4) a list of dependencies. The
line numbers serve simply as unique labels which are used to identify the line in question.
The term indicates the sequence of inferences by which the current line was arrived at
from previous lines of the proof. The formula is simply the formula associated with the node
in the proof tree which the line designates; it is the conclusion of the inferences which have
been completed thus far. Finally, the list of dependencies is a list of the line numbers of the
assumptions upon which the conclusion of the current line depends. In the interactive
construction of a proof, the user types a term of the kind suitable for the term part (2) of a
proof line; the proof checker then assigns a new line number and computes the formula and
dependencies of the new line.
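How the dependency column might be computed can be sketched in Python. This is a guess at the bookkeeping, not a description of the actual proof checker: the helper names `assume`, `combine` and `or_elim` are hypothetical, and discharged assumptions are identified here directly by line number, whereas a real checker would presumably find them by matching the assumed formula against the disjuncts of the major premise.

```python
def assume(line_no):
    """AS: an assumption depends only on itself."""
    return frozenset({line_no})

def combine(*premises):
    """A rule which discharges nothing: the union of the premises' dependencies."""
    return frozenset().union(*premises)

def or_elim(major, case1, hyp1, case2, hyp2):
    """∨-elimination: each case may use (and then discharges) its own hypothesis."""
    return major | (case1 - {hyp1}) | (case2 - {hyp2})

# Dependencies of a few lines of PACK, as they appear in the listing.
deps = {}
deps[7] = assume(7)                    # AS(¬null(X))
deps[8] = combine(deps[7])             # PACKB(X,B,1)(*7)
deps[9] = assume(9)                    # AS(∃A(BLA(A,X,B,1)))
deps[15] = combine(deps[9])            # derived from line 9 alone
deps[16] = assume(16)                  # AS(¬∃A(BLA(A,X,B,1)))
deps[18] = combine(deps[7], deps[16])  # uses lines 7 and 16
deps[19] = or_elim(deps[8], deps[15], 9, deps[18], 16)
```

Under these assumptions line 19 depends only on assumption 7, since the case hypotheses 9 and 16 are discharged by the ∨-elimination.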

This method of laying out a proof tree in linear fashion is of course quite standard. The
only unusual aspect of the notation is the manner in which the application of inference rules
is described. This information, as we have said, appears in the form of the term which is the
second part of every proof step or line; this term resembles a p-term in several ways, and will
be called a "q-term". A q-term is built up from axioms and assumptions and from references
to previous lines by the application of operators which represent inference rules. An axiom
is given by a q-term of the form "AX(φ)" where φ is the formula being asserted as an axiom
(see line 3), while an assumption has the form AS(φ) (see line 2). References to previous
proof steps take the form of an asterisk followed by the line number of the step. The
operators which represent inference rules are: PAIR for ∧-introduction, UNPAIR for
∧-elimination, OI for ∨-introduction, OE for ∨-elimination, II for ⊃-introduction, APPLY for
⊃-elimination and ∀-elimination, λ-abstraction for ∀-introduction, EI for ∃-introduction, and
finally SB for substitution (of "right for left"). The syntax of q-terms is largely borrowed
from the syntax which we have been using for p-terms; for example "PAIR(t1,t2)" is written
"<t1,t2>". Q-terms differ from p-terms in the significant aspect that no proof variables are
used; a q-term is no more than a fragment of an ordinary natural deduction proof written in
applicative syntax.

As a simple example of a q-term, consider the term part of line 21 of the proof PACK,
which reads "OE(*1,*20,*19)". This designates the result of applying ∨-elimination to the
premises represented by lines 1, 20, and 19 respectively. A more complicated example is the
term part of line 8. As usual, we use the syntax "t1(t2)" for APPLY(t1,t2). APPLY, in turn, is
used to designate both the ∀-elimination and ⊃-elimination inference rules. Thus a q-term of
the form t(t1, ... tn) designates either an ∀-elimination or an ⊃-elimination rule, under the
condition that t is not itself an inference rule name. Now, the term part of line 8 is
"PACKB(X,B,1)(*7)". PACKB is the lemma which corresponds to the bounded packing
function "packb". The endformula of PACKB is

∀X B n(¬null(X) ⊃ ∃A(BLA(A,X,B,n))∨(¬∃A(BLA(A,X,B,n))))

The term "PACKB(X,B,1)(*7)" designates the result of an ∀-elimination with PACKB as
the premise, followed by an ⊃-elimination with line 7 as the minor premise. Thus two
inference rule applications are described by line 8 of the proof. In general, one can record as
many inferences as one desires in a single line of proof by the use of a suitably complicated
q-term; the decision as to how much information is to be included in each line is a matter of
convenience.

What we have said so far should make at least a rough understanding of the proof PACK
possible. An informal outline of the proof is as follows. First of all, the proof takes the form
of a case analysis according to whether X is null (see steps 1 and 21). Steps 2 through 6, and
step 20, take care of the case where X is null. If X is not null, the lemma PACKB is used
(step 8). Steps 9 through 19 are devoted to showing that

∃A(legal(A,X,B))∨(¬∃A(legal(A,X,B)))

can be derived from

∃A(BLA(A,X,B,1))∨(¬∃A(BLA(A,X,B,1)))

This is done by a case analysis (step 19) according to whether ∃A(BLA(A,X,B,1)) is true.
The outer case analysis of PACK - namely the case analysis according to whether X is null - is
reflected directly by the conditional expression "if null(X) then <> else packb(X,B,1)" in the
ordinary recursive definition pack. However, the inner case analysis which has just been
mentioned is necessary only in order to demonstrate that the value returned by packb(X,B,1)
is also a valid output for pack; no counterpart of this case analysis is present in the ordinary
recursive definition.

Further information concerning the notation used by the proof checker, and concerning
the proofs PACK and PACKB, is given below. None of this information is of any general
significance; our current purpose is to provide the detail necessary for a full step-by-step
understanding of the proofs PACK and PACKB.

° One lemma other than PACKB appears in PACK, namely NULLD (line 1). The
"endformula" of NULLD is ∀X(null(X)∨¬null(X)). A proof procedure for NULLD is
supplied as part of the normalizer; NULLD(t) returns OI₁(#) if t is "<>", and OI₂(#) if t
has the form "<t1, ... tn>" where n≥1. Also, the lemma LTED appears in PACKB. The
endformula of LTED is "∀n m(n≤m∨m<n)"; the proof procedure LTED(t1,t2) returns
OI₁(#) if t1,t2 are numerals with t1≤t2, and OI₂(#) if t1,t2 are numerals with t2<t1.

° The operator "EV" has the effect of removing abbreviations in the endformula of a
proof - that is, of replacing defined predicates by their definitions. Two defined predicate
symbols appear in PACK, namely "null" and "BLA". These symbols are removed by EV in
lines 4 and 10, respectively. EV should not be thought of as an inference rule, but rather as
part of a facility in the proof checker which allows formulas to be given in an abbreviated
notation; from this point of view, EV has the effect of changing the external form in which a
formula is presented to the user without changing the formula itself. Evidently, uses of EV
could be dispensed with in any proof simply by replacing all abbreviations by their definitions
throughout the proof. The procedure which extracts p-terms from proofs ignores uses of EV;
that is to say, the term which is extracted from "EV(Π)" is just the term extracted from
"Π". Similarly, the operator "EVQ", which appears in PACKB but not in PACK, is used in
conjunction with SB to introduce abbreviations. EVQ is applied to a formula rather than a
proof; EVQ(φ) produces a proof step whose "formula" part is φ ≡ ψ, where ψ is the
formula which results from removing the abbreviations from φ. However, "φ ≡ ψ" should
not be regarded as a formula but rather as another artifact of the abbreviation facility. The
operator SB may be used with "φ ≡ ψ" as its first premise in order to substitute the
abbreviated form φ for the expanded form ψ in the endformula of its second premise. EVQ
and SB are used together in this manner in steps 14 and 15, and steps 28 and 29, of PACKB.
Again, these steps could be removed by replacing all abbreviations by their definitions
throughout the proof.

° There are two variants of the ∨-introduction inference - one puts the "new" disjunct
on the right, and the other puts it on the left. The corresponding forms of an application of
the "OI" operator are: (a) OI(Π,F), and (b) OI(F,Π), where Π is a proof, and F is a formula
(F is the "new" disjunct). More explicitly, let us suppose that the endformula of Π is A.
Then the endformulas of the proofs which result from the two forms (a) and (b) of OI will be
A∨F, and F∨A, respectively.

° An application of the "EI" operator for ∃-introduction has the form "EI(t,Π,∃xφ)",
where t is a term of L0, Π is a proof, and ∃xφ is (of course) a formula. It is assumed that the
endformula of Π has the form φ[x ← t]; otherwise the proof checker will reject this application
of EI. ∃xφ is the endformula of the result of the application.


° In PACK and PACKB we make use of the connectives "∧" and "⊃" as operators of
arbitrary arity. That is to say, just as we have allowed "∀" to quantify over not just one, but
arbitrarily many variables, we allow formulas of the forms [A1, A2 ... An ⊃ B], and of the
form [A1∧A2∧ ... An], where each of these is to be regarded as the result of applying a
single high arity connective "⊃" or "∧" to A1, ... An and in the first case B. The meaning
of [A1, A2 ... An ⊃ B] is just (A1∧A2∧ ... An ⊃ B). The inference rules which treat ∧
and ⊃ are modified in a suitable way. Namely, ∧-introduction now takes as many premises
as desired and produces the conjunction of all the premises as its conclusion.
Correspondingly, one needs a separate variant of ∧-elimination for selecting each of the
conjuncts of a high arity conjunction. The q-term notation for ∧-introduction is
"<Π1, ... Πn>". For ∧-elimination, we have "[Π↓1]" to select the first conjunct, "[Π↓2]"
to select the second, "[Π↓3]" to select the third, and so forth. ("[Π↓k]" corresponds to "πk"
in the notation which we have been using for p-terms). ⊃-elimination also takes as many
arguments as are appropriate to its major premise; in q-term notation we write
"Π(Π1, Π2, ... Πn)" to designate the application of ⊃-elimination to the major premise
Π, and minor premises Π1, Π2 ... Πn. It is assumed here that the endformula of Π has the
form [A1, A2 ... An ⊃ B], where A1, A2 ... An are the endformulas of Π1, Π2 ... Πn,
respectively. The conclusion of this ⊃-elimination inference is B. For an example of the use
of ⊃-elimination of arity 2, see step 17 of PACK. The use of arbitrary arity connectives
constitutes an inessential but convenient extension of notation.

A listing of the proof PACKB is as follows.

 1  AS(¬null(X))                   ¬null(X)                                  1
 2  LTED(n,lnth(B))                (n≤lnth(B))∨(lnth(B)<n)
 3  AS(n≤lnth(B))                  n≤lnth(B)                                 3
 4  LTED(X:1,B:n)                  (X:1≤(B:n))∨(B:n<(X:1))
 5  AS(X:1≤(B:n))                  X:1≤(B:n)                                 5
 6  PACK(tl(X),set(B,n,B:n-(X:1)))
                                   ∃A(legal(A,tl(X),set(B,n,B:n-(X:1))))∨
                                   (¬∃A(legal(A,tl(X),set(B,n,B:n-(X:1)))))
 7  AS(∃A(legal(A,tl(X),set(B,n,B:n-(X:1)))))
                                   ∃A(legal(A,tl(X),set(B,n,B:n-(X:1))))     7
 8  AS(legal(A,tl(X),set(B,n,B:n-(X:1))))
                                   legal(A,tl(X),set(B,n,B:n-(X:1)))         8
 9  AX(∀A X B n([¬null(X),n≤lnth(B),X:1≤(B:n),
                 legal(A,tl(X),set(B,n,B:n-(X:1)))
                 ⊃ legal(n@A,X,B)]))
                                   ∀A X B n([¬null(X),n≤lnth(B),X:1≤(B:n),
                                             legal(A,tl(X),set(B,n,B:n-(X:1)))
                                             ⊃ legal(n@A,X,B)])
10  *9(A,X,B,n)(*1,*3,*5,*8)       legal(n@A,X,B)                            1,3,5,8
11  AX(∀n A(n≤(n@A:1)))            ∀n A(n≤(n@A:1))
12  *11(n,A)                       n≤(n@A:1)
13  <*10,*1,*12>                   legal(n@A,X,B)∧(¬null(X))∧(n≤(n@A:1))     8,5,3,1
14  EVQ(BLA(n@A,X,B,n))            BLA(n@A,X,B,n)≡
                                   (legal(n@A,X,B)∧(¬null(X))∧(n≤(n@A:1)))
15  SBL(*14,*13)                   BLA(n@A,X,B,n)                            8,5,3,1
16  EI(n@A,*15,∃A(BLA(A,X,B,n)))   ∃A(BLA(A,X,B,n))                          8,5,3,1
17  EE(*7,*16,A)                   ∃A(BLA(A,X,B,n))                          5,3,1,7
18  OI(*17,¬∃A(BLA(A,X,B,n)))      ∃A(BLA(A,X,B,n))∨(¬∃A(BLA(A,X,B,n)))      5,3,1,7
19  AS(¬∃A(legal(A,tl(X),set(B,n,B:n-(X:1)))))
                                   ¬∃A(legal(A,tl(X),set(B,n,B:n-(X:1))))    19
20  PACKB(X,B,n+1)                 ¬null(X)⊃
                                   ∃A(BLA(A,X,B,n+1))∨(¬∃A(BLA(A,X,B,n+1)))
21  *20(*1)                        ∃A(BLA(A,X,B,n+1))∨(¬∃A(BLA(A,X,B,n+1)))  1
22  AS(∃A(BLA(A,X,B,n+1)))         ∃A(BLA(A,X,B,n+1))                        22
23  AS(BLA(A,X,B,n+1))             BLA(A,X,B,n+1)                            23
24  EV(*23)                        legal(A,X,B)∧(¬null(X))∧(n+1≤(A:1))       23
25  AX(∀n m(n+1≤m⊃n≤m))            ∀n m(n+1≤m⊃n≤m)
27  <[*24↓1],[*24↓2],*25(n,A:1)([*24↓3])>
                                   legal(A,X,B)∧(¬null(X))∧(n≤(A:1))         23
28  EVQ(BLA(A,X,B,n))              BLA(A,X,B,n)≡
                                   (legal(A,X,B)∧(¬null(X))∧(n≤(A:1)))
29  SBL(*28,*27)                   BLA(A,X,B,n)                              23
30  EI(A,*29,∃A(BLA(A,X,B,n)))     ∃A(BLA(A,X,B,n))                          23
31  EE(*22,*30,A)                  ∃A(BLA(A,X,B,n))                          22
32  OI(*31,¬∃A(BLA(A,X,B,n)))      ∃A(BLA(A,X,B,n))∨(¬∃A(BLA(A,X,B,n)))      22
35  AS(¬∃A(BLA(A,X,B,n+1)))        ¬∃A(BLA(A,X,B,n+1))                       35
36  LEM0(X,B,n)(*19,*35)           ¬∃A(BLA(A,X,B,n))                         19,35
37  OI(∃A(BLA(A,X,B,n)),*36)       ∃A(BLA(A,X,B,n))∨(¬∃A(BLA(A,X,B,n)))      19,35
38  OE(*21,*32,*37)                ∃A(BLA(A,X,B,n))∨(¬∃A(BLA(A,X,B,n)))      1,19
39  OE(*6,*18,*38)                 ∃A(BLA(A,X,B,n))∨(¬∃A(BLA(A,X,B,n)))      5,3,1
40  AS(B:n<(X:1))                  B:n<(X:1)                                 40
41  PACKB(X,B,n+1)                 ¬null(X)⊃
                                   ∃A(BLA(A,X,B,n+1))∨(¬∃A(BLA(A,X,B,n+1)))
42  *41(*1)                        ∃A(BLA(A,X,B,n+1))∨(¬∃A(BLA(A,X,B,n+1)))  1
43  AS(¬∃A(BLA(A,X,B,n+1)))        ¬∃A(BLA(A,X,B,n+1))                       43
44  LEM1(X,B,n)(*1,*40,*43)        ¬∃A(BLA(A,X,B,n))                         1,40,43
45  OI(∃A(BLA(A,X,B,n)),*44)       ∃A(BLA(A,X,B,n))∨(¬∃A(BLA(A,X,B,n)))      1,40,43
46  OE(*42,*32,*45)                ∃A(BLA(A,X,B,n))∨(¬∃A(BLA(A,X,B,n)))      1,40
47  OE(*4,*39,*46)                 ∃A(BLA(A,X,B,n))∨(¬∃A(BLA(A,X,B,n)))      3,1
48  AS(lnth(B)<n)                  lnth(B)<n                                 48
49  LEM2(X,B,n)(*48)               ¬∃A(BLA(A,X,B,n))                         48
50  OI(∃A(BLA(A,X,B,n)),*49)       ∃A(BLA(A,X,B,n))∨(¬∃A(BLA(A,X,B,n)))      48
51  OE(*2,*47,*50)                 ∃A(BLA(A,X,B,n))∨(¬∃A(BLA(A,X,B,n)))      1
52  II(¬null(X),*51)               ¬null(X)⊃∃A(BLA(A,X,B,n))∨(¬∃A(BLA(A,X,B,n)))
53  λX B n(*52)                    ∀X B n(¬null(X)⊃
                                   ∃A(BLA(A,X,B,n))∨(¬∃A(BLA(A,X,B,n))))


4.3 Reduction rules for terms of L0


The following special purpose reduction rules for terms of L0 are provided as part of the
normalizer (reductions on object terms were discussed in section 2.6). We will not belabor the
distinction between numerals and numbers; for example we will allow ourselves to use the
phrase, "the sum of t1 and t2", instead of the more precise phrase "the numeral which denotes
the sum of the numbers which t1 and t2 denote" in the case where t1 and t2 are numerals.
However, special variables, namely, a, b and c, will be used for numerals.


"a + b"  ⇒  "c", where c is the sum of a and b.

"a - b"  ⇒  "c", where c is the result of the indicated subtraction.

"lnth(<t1,t2, ... tn>)"  ⇒  t', where t' is the numeral for n.

tl(<t1,t2, ... tn>)  ⇒  <t2, ... tn>

<t1,t2, ... tn>:a  ⇒  ta, under the condition that 1≤a≤n.

set(<t1,t2, ... tn>,b,t0)  ⇒  t', where one of the following conditions holds: (a) 1≤b≤n
and t' is the result of replacing tb in <t1,t2, ... tn> by t0, (b) b = 0 or b > n, and t' is
<t1,t2, ... tn>.

For example, these reduction rules would have the effect of reducing the term
"<3,4 + 5>:2" to the term "9". It is not hard to see that normalization of any term of L0
with respect to these rules will terminate, and that the normalization of any closed term will
yield either a numeral, or a term of the form <t1, ... tn> where the ti are numerals.


4.4  Results 

The results of the experiments will be presented in several stages. The p-terms which
were extracted from the proofs PACK and PACKB will be given in section 4.4.1. In section
4.4.2, the results of the smallest of the experiments are given in full detail, and the simplex
optimizations are described. Section 4.4.3 presents the optimized algorithm for packing six
blocks into three bins. Finally, section 4.4.4 tabulates the results of the remaining
experiments.

4.4.1 P-terms


The following p-term was extracted from PACK:


ppack =

λ X B
(OE (α2)
    NULLD(X)
    OI(1,EI(SB(α2,#(B)),<>))
    (OE (α4)
        PACKB(X,B,1)(α2)
        OI(1,EE (α6 A) α4 EI([α6↓1],A))
        OI(2,#(X,B)(α2,α4))))


The notation used for p-terms in the implementation differs in several minor ways from
the notation which we have found it convenient to use in our exposition of the p-calculus in
chapter 3. (1) In the implementation, we write "OI(1,t)" and "OI(2,t)" instead of "OI₁(t)" and
"OI₂(t)". (2) As explained in the last section, we now allow "pairing" operators of each
positive arity; arbitrarily many terms t1 ... tn can be "tupled" together into the term
<t1 ... tn>. Correspondingly, there is a projection operator πk for each positive integer k.
Instead of writing "πk(t)" we write "[t↓k]". Note that k must be a numeral; [t↓1], [t↓2], ...
are to be regarded as notations for separate elementary operators of the p-calculus. ("↓" is
not a function symbol!) We remark once more that the use of arbitrary arity tupling instead
of iterated pairing is no more than a notational convenience. (3) The order in which
arguments to the operator "EI" appear is reversed; a p-term "EI(t1,t2)" as expressed in the
notation of chapter 3 is written as "EI(t2,t1)" in the notation of the implementation. Thus, in
a construction "EI(t1,t2):∃xφ" in the new notation, t1 is the construction for φ(t2), and not the
other way around. (4) The numbers which play the role of subscripts to variables appear
simply to the right of the variable name rather than to the right and below the variable name.
Thus, we write "i1", "i2", "α1", "α2" and so forth, instead of "i₁", "i₂", "α₁", "α₂". (The
reason for this change is that the text of the various p-terms given below was derived directly
from the output of the proof checking system; such output, for practical reasons, does not
make use of subscripts. The output was produced in indented form by use of Derek
Oppen's [1979] pretty-printer.)

The p-term extracted from PACKB is:

ppackb =

λ X B n
(λ α2
   (OE (α4)
       LTED(n,lnth(B))
       (OE (α6)
           LTED(X:1,B:n)
           (OE (α8)
               PACK(tl(X),set(B,n,B:n-(X:1)))
               OI(1,
                  EE (α10 A)
                     α8
                     EI(SB(#,
                           <#(A,X,B,n)(α2,α4,α6,α10),α2,
                            #(n,A)>),n@A))
               (OE (α12)
                   PACKB(X,B,n+1)(α2)
                   OI(1,
                      EE (α14 A)
                         α12
                         EI(SB(#,
                               <[α14↓1],[α14↓2],
                                #(n,A:1)([α14↓3])>),A))
                   OI(2,#(X,B,n)(α8,α12))))
           (OE (α16)
               PACKB(X,B,n+1)(α2)
               OI(1,
                  EE (α18 A)
                     α16
                     EI(SB(#,
                           <[α18↓1],[α18↓2],#(n,A:1)([α18↓3])>),
                        A))
               OI(2,#(X,B,n)(α2,α6,α16))))
       OI(2,#(X,B,n)(α4))))


The system of lemmas ppack and ppackb has the termination property with respect to our
call-by-value normalizer; this can be established by exactly the same kind of argument as
would be used to establish the termination of the ordinary recursive functions pack and packb.


Let t1 and t2 be closed terms for lists. By theorem 3.1 of chapter 3, the result of
normalizing "ppack(t1,t2)" has one of the two forms, "OI(1,EI(t3,t4))", and "OI(2,t5)". A
result of the form "OI(1,EI(t3,t4))" may be read as the term part of a construction

OI(1,EI(t3:legal(t4,t1,t2),t4):∃A(legal(A,t1,t2))):∃A(legal(A,t1,t2))∨¬(∃A(legal(A,t1,t2)))

This result indicates that a legal assignment of blocks t1 into bins t2 does indeed exist, and
that t4 is such an assignment. If on the other hand the result has the form "OI(2,t5)", then
no legal packing is possible.


4.4.2  Two  small  experiments 

As indicated in the introduction to this chapter, the experiments consist of specializing
ppack to handle inputs of a particular size and structure by means of the following steps. (1)
The term "ppack(t1,t2)" is normalized. Here t1 and t2 are open terms having the form of the
special inputs to be treated; namely, <i1, ... ik>, and <n,n ... n>, where "i1", ... "ik", and
"n" are numeric variables. (2) The normal form of "ppack(t1,t2)" is subjected to the
"simplex" optimization, which makes use of an additional assumption about the structure of
the inputs; in particular, it is assumed that i1 ≥ i2 ≥ i3 ... ≥ ik. (3) Pruning is applied. The
result of all this is a decision tree algorithm (given by a p-term) for the special task of packing
k blocks into some particular number of bins of equal size, under the assumption that the
blocks have been given in decreasing order of size.

To begin with, we will describe the results of this process for the simplest case which is
not absolutely trivial, namely the case where t1 = <i1,i2>, and t2 = <n,n>. First of all, the
result of normalizing ppack(<i1,i2>,<n,n>) is:


p1 =

OE (α7)
   LTED(i1,n)
   (OE (α9)
       LTED(i2,n-i1)
       OI(1,EI(#(α7,α9),<1,1>))
       (OE (α11)
           LTED(i2,n)
           OI(1,EI(#(α7,α11),<1,2>))
           (OE (α13)
               LTED(i1,n)
               (OE (α15)
                   LTED(i2,n)
                   OI(1,EI(#(α13,α15),<2,1>))
                   (OE (α17)
                       LTED(i2,n-i1)
                       OI(1,EI(#(α13,α17),<2,2>))
                       OI(2,#(α11,α9,α15,α17))))
               OI(2,#(α11,α9,α13)))))
   (OE (α19)
       LTED(i1,n)
       (OE (α21)
           LTED(i2,n)
           OI(1,EI(#(α19,α21),<2,1>))
           (OE (α23)
               LTED(i2,n-i1)
               OI(1,EI(#(α19,α23),<2,2>))
               OI(2,#(α7,α21,α23))))
       OI(2,#(α7,α19)))


This p-term, if written as an ordinary conditional expression, would read:


if i1≤n then
   if i1+i2≤n then <1,1>
   else
      if i2≤n then <1,2>
      else
         if i1≤n then
            if i2≤n then <2,1>
            else
               if i1+i2≤n then <2,2> else <>
         else <>
else
   if i1≤n then
      if i2≤n then <2,1>
      else
         if i1+i2≤n then <2,2> else <>
   else <>



The above conditional expression would also result from normalizing the ordinary
recursive function definition pack on the symbolic inputs <i1,i2> and <n,n> using the
reduction rules mentioned in section 2.8, and in addition the permutation rule:


if (if t1 then t2 else t3) then t4 else t5

=> if t1 then (if t2 then t4 else t5) else (if t3 then t4 else t5)
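
The permutation rule can be read as a rewrite on expression trees. The following Python
sketch is illustrative only: the tuple encoding ("if", test, then, else) and the names
permute and evaluate are assumptions made for this example and do not come from the
implementation described in the text.

```python
from itertools import product

def permute(term):
    """Move nested conditionals out of test position using the rule
    if (if t1 then t2 else t3) then t4 else t5
      => if t1 then (if t2 then t4 else t5) else (if t3 then t4 else t5)."""
    if not (isinstance(term, tuple) and term[0] == "if"):
        return term  # a variable or constant
    _, cond, then_, else_ = term
    cond, then_, else_ = permute(cond), permute(then_), permute(else_)
    if isinstance(cond, tuple) and cond[0] == "if":
        _, t1, t2, t3 = cond
        rewritten = ("if", t1,
                     ("if", t2, then_, else_),
                     ("if", t3, then_, else_))
        # t2 or t3 may themselves be conditionals, so rewrite again.
        return permute(rewritten)
    return ("if", cond, then_, else_)

def evaluate(term, env):
    """Evaluate a term; variable names are looked up in env."""
    if isinstance(term, tuple) and term[0] == "if":
        _, cond, then_, else_ = term
        return evaluate(then_ if evaluate(cond, env) else else_, env)
    return env[term]

nested = ("if", ("if", "p", "q", "r"), "x", "y")
flat = permute(nested)
```

After the rewrite no conditional appears in test position, and the value of the expression is
unchanged under every assignment of truth values to p, q and r.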

Now, the "simplex optimization" consists of removing "pre-decided" case analyses.
Another transformation is applied at this stage, namely the replacement of occurrences of
assumptions when possible by "proofs" of those assumptions from other available information.
This last transformation improves the effectiveness of pruning, since it removes apparent but
in fact unnecessary dependencies between the facts involved in the computation. Since all of
the decision predicates which appear in bin-packing take the form of inequalities between
linear terms, the simplex algorithm may be used to perform these transformations.


The "simplex transformations" are instances of the following general replacement
transformation on proofs. First of all, we define the set of active assumptions at a node of a
proof tree to be the set of assumptions discharged along the path from the node in question to
the root node of the proof. More formally, a formula A is active at a node N if N lies in
[using p-term notation]: (1) Π2 of OE(Π1:A∨B,Π2,Π3), (2) Π3 of OE(Π1:B∨A,Π2,Π3), (3) Π
of II(A,Π), (4) Π2 of EE(Π1:∃xA,Π2). A replacement transformation is a transformation
which replaces a subproof Π':A rooted at node N of a proof Π by another proof Π'':A of the
same formula A, subject to the condition that the open assumptions of Π'' are among those
active at N.


The simplex transformations are replacement transformations of a special kind. Consider
a subterm of the form "LTED(t1,t2)" which appears in a bin-packing p-term. Suppose that
one of t1≤t2 or t2<t1 follows from the active assumptions at the node at which LTED(t1,t2)
appears (all of the active assumptions will themselves be linear inequalities). That is to say,
suppose that the outcome of executing LTED(t1,t2) is pre-decided by the linear inequalities
which have already been assumed at the current node in the decision tree. Then the
invocation of the lemma LTED can be removed in favor of a small proof of "t1≤t2 ∨ t2<t1" by
means of an ∨-introduction from one or the other of the results "t1≤t2" or "t2<t1".
Specifically, that proof will have the form

OI(k, AX([F1,F2, ..., Fn ⊃ F0], AS(F1), AS(F2), ..., AS(Fn)))

where k is either 1 or 2, where F0 is either t1≤t2 or t2<t1, and where F1,F2, ..., Fn are the
various inequalities which are active assumptions at this point in the decision tree, and which
are needed to conclude that F0 holds. This is exactly the replacement which is performed by
the first simplex transformation - except that the replacement is carried out in the language of
untyped p-terms; thus the replacing term has the form OI(k,#(α1, ..., αn)), where the αi are
proof variables.

Now, let us consider the second transformation - the dependency removal transformation.
Suppose that an assumption AS(F0) in a specialized bin-packing proof follows from other
assumptions which are active at the node where the assumption appears. Then the various
results which are derived using the assumption have the appearance of depending on that
assumption, but the dependency is in a sense unreal - it could be dispensed with. If we wish
to make the best use of pruning, then apparent dependencies of this kind should be
eliminated. So we use the simplex method to replace assumptions AS(F0) by proofs of those
assumptions from other assumptions which are currently in effect. The form of the proofs
with which assumptions are replaced is

AX([F1,F2, ..., Fn ⊃ F0], AS(F1), AS(F2), ..., AS(Fn))

where F1,F2, ..., Fn are the formulas needed to establish F0. In p-term notation, this has the
form #(α1, ..., αn). Note that the formulas which are associated with proof variables in the
bin-packing p-terms can be determined by finding the OE operator which binds the variable,
and looking at its first argument "LTED(t1,t2)": if the variable in question appears in the
second premise to this OE operator then the associated formula is "t1≤t2", and otherwise it is
"t2<t1".

One more piece of information remains to be specified about the simplex transformations.
It may happen that several distinct proofs can be used to replace a single assumption or
"LTED" invocation: the inequality in question might follow from several different subsets of
the currently active set of assumed inequalities. We have not said how a choice among
several such possibilities is to be made. In fact only one possibility, namely the one generated
by the following algorithm, is considered. Let F1, ..., Fn be a list of all the assumptions
active at a given node in the order of "innermost" to "outermost": that is, F1 is the
assumption discharged nearest the current node, while Fn is the assumption discharged nearest
the root of the proof. In attempting to find a minimal subset of {F1, ..., Fn} from which a
formula F0 can be derived, our algorithm proceeds in the following way. First it checks
(using the simplex method) whether F0 follows from {F1}. If not then it checks {F1,F2},
{F1,F2,F3}, and so on, until it finds a least j such that F0 follows from {F1, ..., Fj}, or until it
is determined that F0 does not follow from the entire set {F1, ..., Fn}. In the latter case, we
are done, and return a negative answer. In the former case, we scan through the set
{F1, ..., Fj} again, in the order in which it is given. For each element Fi considered, an
attempt is made to remove Fi from the set; the attempt is deemed successful if the reduced set
still implies F0. After removing or attempting to remove each Fi in turn, we evidently have a
minimal set of inequalities with the desired property. It is this set which is returned by the
algorithm. This algorithm was the first that came to mind, and, because it produced good
results, we did not try another.
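
The two-phase search just described can be sketched as follows. This is a minimal Python
illustration, not the implementation used in the experiments: the function name
minimal_support and the oracle parameter entails are hypothetical, and the toy oracle
(assumptions modelled as sets of atomic facts) merely stands in for the simplex-based test on
linear inequalities. Any background information would be folded into the oracle.

```python
def minimal_support(assumptions, goal, entails):
    """assumptions: list ordered innermost first (F1, ..., Fn).
    entails(premises, goal) -> bool is the decision procedure (assumed given).
    Returns a list of assumptions from which goal follows, or None."""
    # Phase 1: find the least j such that F1..Fj entail the goal.
    support = None
    for j in range(1, len(assumptions) + 1):
        if entails(assumptions[:j], goal):
            support = list(assumptions[:j])
            break
    if support is None:
        return None  # the goal does not follow even from all assumptions

    # Phase 2: try to drop each element in turn, in the given order.
    i = 0
    while i < len(support):
        reduced = support[:i] + support[i + 1:]
        if entails(reduced, goal):
            support = reduced      # removal succeeded; same index is now the next element
        else:
            i += 1                 # element is needed; keep it and move on
    return support


def toy_entails(premises, goal):
    """Toy stand-in for the simplex test: each premise is a set of facts,
    and the goal follows if all of its facts are supplied by the premises."""
    known = set().union(*premises)
    return goal <= known


F1, F2, F3, F4 = frozenset("a"), frozenset("b"), frozenset("ab"), frozenset("c")
needs_both = minimal_support([F1, F2, F3, F4], {"a", "b"}, toy_entails)
needs_one = minimal_support([F1, F3], {"a", "b"}, toy_entails)
no_support = minimal_support([F1, F4], {"b"}, toy_entails)
```

Each element that survives the second phase was tested against a set containing the final
result, so, provided the oracle is monotone in its premises, no single element of the returned
set can be removed.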


In each of the simplex transformations, the inequalities i1 > i2, i2 > i3, ..., ik-1 > ik are
assumed as "background" information. That is to say, whenever we used the phrase "F0
follows from {F1, ..., Fj}" in the above, we meant "F0 follows from {F1, ..., Fj} and
{i1 > i2, ..., ik-1 > ik}".

Note that the only property of the bin-packing proofs of which simplex transformations
make special use is the fact that the decision predicates have the form of linear inequalities.
Transformations of the same kind - namely, replacements of case analyses and replacement of
assumptions - can be applied to any proof under the condition that a decision procedure is
available for the case predicates which appear in the proof. Thus, the special purpose part of
the simplex transformations is just the simplex algorithm itself.


As mentioned earlier, the simplex algorithm used in the implementation was not written
by the current author. Rather, a "canned" simplex package, written (in MacLISP) by Greg
Nelson, was imported into the proof manipulation system.


The result of applying the simplex transformations to the normal form of
"pack(<i1,i2>,<n,n>)" given earlier, and then normalizing again is:

p2 =

OE (α25)
   LTED(i1,n)
   (OE (α27)
      LTED(i2,n-i1)
      OI(1,EI(#(α27),<1,1>))
      OI(1,EI(#(α25),<1,2>)))
   OI(2,#(α25))


Written  as  an  ordinary  conditional  expression,  this  is: 


e2 = if i1≤n then (if i1+i2≤n then <1,1> else <1,2>) else FAIL

Note that the first of the two simplex optimizations - namely, the one which removes pre-
decided case analyses - could as easily have been applied to the conditional term e1 (the
conditional expression given earlier for p1), and the result would have been e2. Thus, so far,
no use has been made of the additional dependency information which the p-term contains,
but which the conditional term does not. However, pruning is applicable to p2, yielding:

p3 = OE (α29) LTED(i1,n) OI(1,EI(#(α29),<1,2>)) OI(2,#(α29))

Written  as  a  conditional  expression,  this  is: 


e3 = if i1≤n then <1,2> else FAIL

Thus p3 tries only one packing, namely <1,2>. If any packing works, then this one
must. This fact is "automatically realized" by the dependency analysis involved in pruning.
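
The dependency analysis can be illustrated on the p-term for two blocks. The Python sketch
below rests on a simplifying assumption that is not spelled out in this section: a case analysis
OE is replaced by one of its branches whenever that branch makes no use of the proof variable
which the OE binds. The precise definition of pruning is given elsewhere and may impose
further conditions. The tuple encoding and the helper names (free_vars, prune, leaf, fail)
are hypothetical, and proof variables are written "a25" for α25.

```python
def free_vars(t):
    """Proof variables on which a p-term depends."""
    tag = t[0]
    if tag == "#":                      # ("#", (var, ...))
        return set(t[1])
    if tag == "OI":                     # ("OI", k, body)
        return free_vars(t[2])
    if tag == "EI":                     # ("EI", body, witness)
        return free_vars(t[1])
    if tag == "OE":                     # ("OE", var, lemma, left, right)
        _, var, _lemma, left, right = t
        return (free_vars(left) | free_vars(right)) - {var}
    raise ValueError(f"unknown operator {tag!r}")

def prune(t):
    """Bottom-up removal of case analyses whose bound variable is unused
    in one of the branches (simplified reading of pruning)."""
    tag = t[0]
    if tag == "OI":
        return ("OI", t[1], prune(t[2]))
    if tag == "EI":
        return ("EI", prune(t[1]), t[2])
    if tag != "OE":
        return t
    _, var, lemma, left, right = t
    left, right = prune(left), prune(right)
    if var not in free_vars(left):
        return left
    if var not in free_vars(right):
        return right
    return ("OE", var, lemma, left, right)

def leaf(variables, packing):
    """A successful branch: OI(1, EI(#(vars), packing))."""
    return ("OI", 1, ("EI", ("#", tuple(variables)), packing))

def fail(variables):
    """A failing branch: OI(2, #(vars))."""
    return ("OI", 2, ("#", tuple(variables)))

p2 = ("OE", "a25", "LTED(i1,n)",
      ("OE", "a27", "LTED(i2,n-i1)",
       leaf(["a27"], (1, 1)),
       leaf(["a25"], (1, 2))),
      fail(["a25"]))

pruned = prune(p2)
```

Under this reading the inner case analysis on α27 disappears because its second branch depends
only on α25, and the result coincides with p3 up to the naming of proof variables.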


Note that p3 computes a different function from that computed by p2. Also note that p2
is the optimal (ie smallest and fastest) conditional expression for computing the function λi1 i2
n. pack(<i1,i2>,<n,n>) with i1>i2. Thus, it is only by using a transformation (such as
pruning) which modifies the extensional meaning of computational descriptions that we are
able to achieve the improvement which p3 represents over p2.
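
The relation between the two conditional expressions can be checked by brute force. The
following Python sketch transcribes e2 and e3 and compares them with an exhaustive search
over packings; the helper names legal and packable, the representation of FAIL as None, and
the range of test values are choices made for this illustration only.

```python
from itertools import product

FAIL = None

def e2(i1, i2, n):
    if i1 <= n:
        return (1, 1) if i1 + i2 <= n else (1, 2)
    return FAIL

def e3(i1, i2, n):
    return (1, 2) if i1 <= n else FAIL

def legal(sizes, packing, n, bins=2):
    """True if no bin receives more than n in total."""
    load = [0] * bins
    for size, b in zip(sizes, packing):
        load[b - 1] += size
    return all(x <= n for x in load)

def packable(sizes, n, bins=2):
    """Exhaustive search: does any legal packing exist?"""
    return any(legal(sizes, p, n, bins)
               for p in product(range(1, bins + 1), repeat=len(sizes)))

agree_on_failure = True   # both fail exactly when no legal packing exists
both_legal = True         # both answers are legal whenever one exists
differences = []          # inputs on which the two answers differ
for n in range(1, 8):
    for i1 in range(2, 8):
        for i2 in range(1, i1):              # background assumption i1 > i2
            sizes = (i1, i2)
            r2, r3 = e2(i1, i2, n), e3(i1, i2, n)
            ok = packable(sizes, n)
            agree_on_failure &= (r2 is FAIL) == (r3 is FAIL) == (not ok)
            if ok:
                both_legal &= legal(sizes, r2, n) and legal(sizes, r3, n)
                if r2 != r3:
                    differences.append((i1, i2, n))
```

On every tested input both expressions meet the specification, yet they return different
packings whenever both blocks fit into a single bin, which is the sense in which the
extensional meaning has changed.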


As mentioned in the introduction to this chapter, it is not feasible to perform stages (1)
and (2) of the specialization separately for the larger examples. The reason for this is that the
p-terms which result from stage (1) alone are too large to fit in memory in the current
implementation. Thus the simplex transformations and normalization were run in parallel; the
normalizer was modified so as to apply the case analysis removal procedure when called upon
to "normalize" an expression of the form "LTED(t1,t2)". Assumption replacement was
implemented in a similar manner.

We now present the results of another small experiment, namely, the experiment in which
pack(<i1,i2,i3>,<n,n>) is specialized. First of all, the worst case running time of the original
version of pack (or equivalently of ppack with our call-by-value normalizer) with i1>i2>i3 is
10, where running time is measured in number of comparisons. More precisely, there are
numerals a, b, c, d with a>b>c such that the number of comparisons made in the course of the
execution of pack(<a,b,c>,<d,d>) by a standard call-by-value evaluator for conditional
expressions is 10, and furthermore this is the largest number of comparisons which will be
made in any execution of pack applied to an input with this form. The worst case running
times for pack reported here and below were computed using a program which searches
through all possible execution paths (ie sequences of comparisons) of pack when applied to an
input of the special form under consideration; the length of the longest such path is returned.
The simplex algorithm is used to determine which execution paths are possible, and which are
not.

The result of normalizing and applying the simplex transformations to
pack(<i1,i2,i3>,<n,n>) is:

OE (α11)
   LTED(i1,n)
   (OE (α13)
      LTED(i2,n-i1)
      (OE (α15)
         LTED(i3,n-i1-i2)
         OI(1,EI(#(α15),<1,1,1>))
         OI(1,EI(#(α13),<1,1,2>)))
      (OE (α17)
         LTED(i3,n-i1)
         OI(1,EI(#(α11,α17),<1,2,1>))
         (OE (α19)
            LTED(i3,n-i2)
            OI(1,EI(#(α11,α19),<1,2,2>))
            OI(2,#(α19)))))
   OI(2,#(α11))


Pruning when applied to the above p-term yields


OE (α21)
   LTED(i1,n)
   (OE (α29)
      LTED(i3,n-i2)
      OI(1,EI(#(α21,α29),<1,2,2>))
      OI(2,#(α29)))
   OI(2,#(α21))


Written  as  an  ordinary  conditional  expression,  this  is: 


if i1 ≤ n then
   if i2+i3 ≤ n then <1,2,2> else FAIL
else FAIL


Note that pruning again yields an optimal algorithm for the special case considered - an
algorithm which computes a different function from that originally computed by pack.


4.4.3  An algorithm for packing six blocks into three bins

The  results  of  the  experiment  concerning  the  packing  of  six  blocks  into  three  bins  were 
described  in  general  terms  in  the  introduction  to  this  chapter.  The  end  product  of  that 
experiment  -  that  is  to  say,  the  algorithm  produced  at  the  last  stage  of  the  three  stages  of 
optimization  -  is  given  below  as  an  ordinary  conditional  expression. 


if i1 ≤ n then
   if i2+i3 ≤ n then
      if i1+i6 ≤ n then <1,2,2,3,3,1>
      else
         if i4+i5+i6 ≤ n then <1,2,2,3,3,3>
         else FAIL
   else
      if i2+i4 ≤ n then
         if i1+i6 ≤ n then <1,2,3,2,3,1>
         else
            if i3+i5+i6 ≤ n then <1,2,3,2,3,3>
            else FAIL
      else
         if i3+i4 ≤ n then
            if i2+i5 ≤ n then
               if i1+i6 ≤ n then <1,2,3,3,2,1>
               else
                  if i2+i5+i6 ≤ n then <1,2,3,3,2,2>
                  else
                     if i3+i4+i6 ≤ n then <1,2,3,3,2,3>
                     else FAIL
            else
               if i3+i4+i5 ≤ n then
                  if i6+i2 ≤ n then <1,2,3,3,3,2>
                  else
                     if i3+i4+i5+i6 ≤ n then <1,2,3,3,3,3>
                     else FAIL
               else FAIL
         else FAIL
else FAIL
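
The decision tree above can be transcribed directly and compared with exhaustive search on
small inputs. The Python sketch below is such a check, written for illustration only: the
names pack6, legal and packable, the use of None for FAIL, and the test range (block sizes
1 to 4 in non-increasing order, bin sizes 1 to 9) are assumptions of this example.

```python
from itertools import combinations_with_replacement, product

FAIL = None

def pack6(i1, i2, i3, i4, i5, i6, n):
    """Transcription of the decision tree for six blocks and three bins."""
    if i1 > n:
        return FAIL
    if i2 + i3 <= n:
        if i1 + i6 <= n:
            return (1, 2, 2, 3, 3, 1)
        if i4 + i5 + i6 <= n:
            return (1, 2, 2, 3, 3, 3)
        return FAIL
    if i2 + i4 <= n:
        if i1 + i6 <= n:
            return (1, 2, 3, 2, 3, 1)
        if i3 + i5 + i6 <= n:
            return (1, 2, 3, 2, 3, 3)
        return FAIL
    if i3 + i4 > n:
        return FAIL
    if i2 + i5 <= n:
        if i1 + i6 <= n:
            return (1, 2, 3, 3, 2, 1)
        if i2 + i5 + i6 <= n:
            return (1, 2, 3, 3, 2, 2)
        if i3 + i4 + i6 <= n:
            return (1, 2, 3, 3, 2, 3)
        return FAIL
    if i3 + i4 + i5 <= n:
        if i6 + i2 <= n:
            return (1, 2, 3, 3, 3, 2)
        if i3 + i4 + i5 + i6 <= n:
            return (1, 2, 3, 3, 3, 3)
        return FAIL
    return FAIL

def legal(sizes, packing, n, bins=3):
    """True if no bin receives more than n in total."""
    load = [0] * bins
    for size, b in zip(sizes, packing):
        load[b - 1] += size
    return max(load) <= n

def packable(sizes, n, bins=3):
    """Exhaustive search over all assignments of blocks to bins."""
    return any(legal(sizes, p, n, bins)
               for p in product(range(1, bins + 1), repeat=len(sizes)))

mismatches = []
# Tuples are generated in non-increasing order: (4,4,4,4,4,4), (4,4,4,4,4,3), ...
for sizes in combinations_with_replacement(range(4, 0, -1), 6):
    for n in range(1, 10):
        result = pack6(*sizes, n)
        if result is FAIL:
            ok = not packable(sizes, n)   # FAIL must mean no packing exists
        else:
            ok = legal(sizes, result, n)  # a returned packing must be legal
        if not ok:
            mismatches.append((sizes, n))
```

The check makes at most six comparisons per input, whereas the exhaustive search examines up
to 729 assignments.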


4.4.4  Table of other results


The following table summarizes the results of the remaining experiments. Six numbers
are associated with each experiment. These quantities are:


(1)  P.  This  is  the  worst  case  running  time  of  pack  applied  to  inputs  of  the  form  under 
consideration. 


(2) EP. The performance of the general purpose algorithm pack in treating special cases
where the bins are all of the same size is very bad. One reason for this is that no use is made
of symmetries introduced by the equal sizes of the bins; each of various packings which are
equivalent under renaming of bins is considered separately. It was of interest to compare the
performance of our optimized special purpose algorithms with the performance of an algorithm
with the same design as pack, but which takes the symmetries introduced by equal bin sizes
into account. That algorithm is as follows:


epack(X,s,k) ← epack1(X,<>,1,s,k)


epack1(X,B,n,s,k) ←

   if n≤lnth(B) then
      if X:1≤B:n then
         {λz.
            (if z≠FAIL then n @ z
             else epack1(X,B,n+1,s,k))}
         (epack1(tl(X),set(B,n,B:n - X:1),1,s,k))
      else epack1(X,B,n+1,s,k)
   else
      if k≥1 ∧ (X:1≤s) then
         {λz. if z≠FAIL then (lnth(B)+1) @ z
              else FAIL} (epack1(tl(X),B*<X:1>,1,s,k-1))
      else FAIL


The algorithm epack(X,s,k) searches for a packing of the blocks in the list X into k bins
each of size s. The subprogram epack1(X,B,n,s,k) searches for a packing of the blocks X into
a collection of bins described by the inputs B, s, and k. The initial elements of this collection
are just the bins whose sizes are given in B, while the remainder of the collection consists of k
bins each of size s. As in packb, the first block X:1 must be placed in a bin whose index is at
least n. The behavior of epack1 resembles that of packb, except that it keeps track of which
bins are still empty. A block is placed in an empty bin only if the attempt to place it in a
non-empty bin leads to failure. In contrast to packb, epack1 attempts at most one placement
of any block into an empty bin. The term "B*<X:1>" in epack1 denotes the result of
appending the list "<X:1>" onto the end of the list B.
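
For readers who prefer an executable rendering, the following Python sketch follows the design
of epack. It is not a transcription, and it rests on several assumptions which the definition
above leaves open: an empty list of blocks is packed by the empty assignment; the list B is
taken to hold the remaining capacity of each opened bin, so that a newly opened bin is recorded
with capacity s minus the size of the block placed in it; and indices are 0-based internally
while the returned bin numbers are 1-based. The helper is_legal is likewise hypothetical.

```python
FAIL = None

def epack(blocks, size, k):
    """Search for a packing of blocks into k bins, each of the given size."""
    return epack1(tuple(blocks), (), 0, size, k)

def epack1(blocks, caps, n, size, k):
    """caps: remaining capacities of the bins opened so far (assumed meaning of B).
    n: index of the first opened bin still to be tried for the first block.
    k: number of bins not yet opened."""
    if not blocks:                       # assumed base case
        return ()
    first, rest = blocks[0], blocks[1:]
    if n < len(caps):
        if first <= caps[n]:
            new_caps = caps[:n] + (caps[n] - first,) + caps[n + 1:]
            z = epack1(rest, new_caps, 0, size, k)
            if z is not FAIL:
                return (n + 1,) + z
        # Block does not fit in bin n, or placing it there led to failure.
        return epack1(blocks, caps, n + 1, size, k)
    # All opened bins have been tried: open at most one new bin.
    if k >= 1 and first <= size:
        z = epack1(rest, caps + (size - first,), 0, size, k - 1)
        if z is not FAIL:
            return (len(caps) + 1,) + z
    return FAIL

def is_legal(blocks, packing, size, k):
    """True if packing uses bins 1..k and no bin exceeds size."""
    load = [0] * k
    for block, b in zip(blocks, packing):
        if not 1 <= b <= k:
            return False
        load[b - 1] += block
    return len(packing) == len(blocks) and max(load, default=0) <= size

found = epack((3, 3, 2, 2, 1, 1), 4, 3)
not_found = epack((3, 3, 3), 4, 2)
```

Because all unopened bins have the same size, trying a single new bin per block loses no
solutions, which is the symmetry that epack is designed to exploit.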

The  number  EP  represents  the  worst  case  running  time  of  epack. 

Note that, even if it had turned out that the "hand-optimized" algorithm was more
efficient than the specialized algorithms which we produce by automatic methods, it would
not follow that the automatic methods are not of use. An automatic specialization method
such as the one currently under discussion starts with a general algorithm and with a
description of the special form of the inputs to be considered; the output of the method is
then a specialized algorithm which deals with inputs of that special form. The most direct
measure of the effectiveness of the specialization method is given by a comparison of the
output of the method with the original algorithm, and not with some third algorithm (such as
epack) produced by a person to handle inputs of the special form. A separate matter of
interest is to compare human and automatic performance in this regard, as we are doing at the
moment. As it happens, and as will be seen, our automatically specialized algorithms are in
fact faster than the algorithm epack given above.

(3) D is the depth of the decision tree produced by applying normalization and the
simplex transformations to pack(<i1, ..., in>,<n,n, ..., n>). Equivalently, D is the number
of comparisons made along the longest path down the decision tree; that is to say, the
"running time" of the decision tree.

(4)  Dp  is  the  depth  of  the  decision  tree  produced  by  applying  pruning  to  the  tree  of  (3) 
immediately  above. 

(5) S is the size of the decision tree of (3) measured as the number of decision points;
equivalently, S is the number of occurrences of "LTED" in the p-term.

(6) Sp is the size of the pruned decision tree of (4).

In the table, the above quantities are arrayed in the form:


P
EP
D   S
Dp  Sp


The effectiveness of pruning is indicated by the differences between D and Dp, and between
S and Sp. The table of results is as follows. Occurrences of in the table indicate that the
relevant decision trees could not be constructed because of lack of memory space.


4.5  Summary


The results of the experiment show that prunable redundancies can indeed arise in the
specialization of a simple combinatorial algorithm, and consequently that pruning can be of
use in specialization. It is of course possible that equally good specialized algorithms for the
particular problem treated - namely, bin-packing - could have been arrived at by a head-on
attack. For example, one such attack would involve manipulating the propositional formulas
which result from unwinding the definition of a legal packing as applied to inputs of restricted
size. However, as has been remarked earlier, the methods by which the specialization was
done are for the most part completely general in their applicability; the only special property
of the bin-packing problem which was used was the decidability of linear inequalities. The
machinery of normalization, pruning, and proof replacement may be applied to any proof
whatever. The experiments should be seen as a first test of the utility of this general
machinery. Our purpose was not to develop fast special purpose bin-packing algorithms, but
to investigate pruning in a setting where its effects could be easily isolated.




Chapter  5 


Other  Applications 


Until now, we have restricted attention to the use of proof manipulation in specializing
algorithms. The purpose of this chapter is to briefly indicate other computational applications
of the proof manipulation technology which has been described in the course of this thesis,
and at the same time to outline some connections between our work and other traditions of
work within computer science. Applications to two kinds of computational problems other
than specialization will be considered, namely, applications to the automatic construction of
proofs (from proof fragments; section 5.1), and to the analysis of change (section 5.2).


5.1  Automatic construction of proofs

As emphasized in the introduction to this thesis, most work in computer science to do
with formal proof systems has concerned the automatic construction of proofs, and not their
manipulation. Generally speaking, the aim of such work has been to provide automatic means
for determining the truth values of propositions; a proof of a proposition is constructed in
order to determine that it is valid. Automatic proof construction (or "automatic deduction")
in its most pure and ambitious form involves starting with an arbitrary formula of an
expressive language (eg the predicate calculus) as the only input data; the output is either an
indication that a proof has been found, or an indication of failure. Other forms of automatic
deduction make use of additional input data beyond the formula to be proved; for example
sets of "rules" for backward chaining [Shortliffe 1974], or sets of programs which indicate
in explicit algorithmic terms how certain problems are to be reduced to subproblems [Hewitt
1971]. It is traditional within artificial intelligence to refer to this additional input data as
"knowledge".

Normalization constitutes, in a certain sense, a method for automatically constructing
proofs; a normal proof of a proposition is automatically constructed from an arbitrary proof
of that same proposition. In this case, the "additional input data" in the sense of the last
paragraph consists of the original proof. From the point of view of automatic deduction,
normalization is of no use, since the additional input data with which it starts is already
satisfactory evidence for the truth of the proposition in question. However, by liberalizing
the requirements which apply to proof procedures and lemmas (section 2.5) it is possible to
use the machinery of normalization as developed in chapters 2 and 3 for constructing a
normal and complete proof of a proposition starting from an incomplete proof of the same
proposition. Under these conditions, the normalization of the incomplete proof includes a
search for evidence for propositions, and thus constitutes a form of automatic deduction in the
traditional sense.

Specifically, let us drop the following requirements concerning lemmas: (1) the
requirement that lemmas must be true formulas, (2) the requirement that lemmas may not
appear in the proofs constructed by proof procedures, and (3) the requirement that a proof
procedure may not return "FAIL" when applied to closed arguments. Also, we will now
allow proof procedures to produce proofs which make use of assumptions which are active at
the point where a lemma appears. (The notion of an active assumption is defined in section
4.4.2.) We retain the requirement that all axioms be true. As a result of the removal of
requirements 1 - 3, it is now possible to construct incomplete "proofs" of incorrect formulas -
proofs which proceed from false lemmas to false conclusions. However, the main point here
is that the process of normalization - exactly as described in chapters 2 and 3 - may have the
effect of removing appearances of lemmas - thus converting an incomplete proof of a formula
whose truth is in doubt into a complete and reliable proof of that same formula.

If normalization is implemented in a call-by-value manner as described in section 4.1,
then the normalization of an incomplete proof corresponds in a direct way to proof search by
backward-chaining through implications - in other words to "subgoaling". Specifically, in the
course of normalizing a proof Π:φ containing lemmas L1:∀xψ1(x), L2:∀xψ2(x), ...,
Ln:∀xψn(x), the proof procedures for some or all of the lemmas are invoked. (The invocation
of a proof procedure corresponds roughly to an attempt to "match" a subgoal.) When the
proof procedure for, say, Li is called with input t, the procedure will either fail
(corresponding to the failure of a subgoal in backward chaining), or return a proof Πi of ψi(t).
In the latter case, Πi is then normalized. Since Πi may itself contain lemmas, the
normalization of Πi will in general involve further backchaining. If the end result of
normalization is a proof in which no lemmas any longer appear, then the endformula φ has
been "proved"; this corresponds to a successful search for a proof by backward chaining. (In
particular, this corresponds to backward-chaining without backtracking; however, the addition
of backtracking to the mechanism of normalization is a straightforward matter.)
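
The shape of this search can be conveyed by a deliberately small Python sketch. It is a
propositional caricature and not the normalization procedure itself: the dictionary rules
plays the role of the proof procedures (mapping a lemma to the lemmas which appear in the
proof returned for it), only one rule per goal is consulted, so there is no backtracking, and
the names prove and rules, together with the depth limit, are assumptions of this example.

```python
FAIL = None

def prove(goal, rules, depth=0, limit=50):
    """Backward chaining without backtracking.

    rules maps a goal to the list of subgoals of the single rule available
    for it; an empty list means the goal is established outright.
    Returns a proof tree (goal, subproofs) or FAIL."""
    if depth > limit or goal not in rules:
        return FAIL                      # the 'proof procedure' fails
    subproofs = []
    for subgoal in rules[goal]:
        proof = prove(subgoal, rules, depth + 1, limit)
        if proof is FAIL:
            return FAIL                  # no alternative rule is tried
        subproofs.append(proof)
    return (goal, tuple(subproofs))

rules = {
    "c": ["a", "b"],   # c follows from a and b
    "b": ["a"],        # b follows from a
    "a": [],           # a needs no further lemmas
    "loop": ["loop"],  # a goal whose search never bottoms out
}

proof_of_c = prove("c", rules)
proof_of_d = prove("d", rules)
proof_of_loop = prove("loop", rules)
```

A proof tree in which every leaf has no remaining subgoals corresponds to a normal proof in
which no lemmas any longer appear.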

Let Π be an incomplete proof of a universal formula ∀xφ(x). Then it will often happen
that the normalization of Π fails to yield a complete proof of ∀xφ(x), but at the same time,
normalization of the proof

          Π
        ∀xφ(x)
   ∀E ----------
         φ(t)

for a particular term t does yield a complete proof. This can come about in the following
way. The normalization of Π(t) will in general select a smaller and more specialized set of
"subgoals" (that is, lemmas for which proof procedures are invoked) than the normalization of
Π; in theorem proving language, the normalization of Π determines the particular set of
subgoals needed to verify each instance φ(t) of the general formula ∀xφ(x) - different sets of
subgoals will be generated for different instances. The subgoals generated by normalizing
Π(t) may be satisfiable even though those generated by normalizing Π are not. In this case,
Π does not provide evidence for the truth of the general statement ∀xφ(x) (indeed, ∀xφ(x)
may not be true), but does indicate a method for attempting to construct evidence for
instances φ(t) of the general statement.

In the case where φ is existential, that is, where φ(x) = ∃yψ(x,y) for some ψ, a successful
normalization of

            Π
        ∀x∃yψ(x,y)
   ∀E --------------
         ∃yψ(t,y)

yields a value for y; thus Π describes an algorithm for computing a partial function satisfying
the specification ψ. The computation in question involves a mixture of ordinary computation
(normalization), and proof search by backward chaining. In this respect, normalization of
partial proofs resembles the behavior of "pattern matching languages" such as Planner [Hewitt
1971] and its successors, where ordinary computation is mingled with subgoaling. More will be
said about this resemblance later.

The correspondence between normalization and familiar kinds of backward chaining is
enhanced if the proof procedures for lemmas proceed by searching for a "match" between the
lemma to be proved and the endformulas of proofs in a pre-existing data base. For example,
suppose that one starts with a data base {Π1, ..., Πn} of incomplete proofs of universal
formulas. Suppose further that all lemmas which appear in proofs of the data base are ∀∃
formulas. Finally, suppose that the following uniform proof procedure is supplied for all
lemmas: the procedure, when given input t for a lemma L:∀x∃yψ(x,y), scans the data base
{Π1, ..., Πn} looking for a proof Πi:∀zφ(z) such that the formulas ψ(t,y) and φ(z) unify in
the sense that there is a vector of terms r0,r1,r2, ..., rk with ψ(t,r0) = φ(r1,r2, ..., rk). If such a
proof Πi is found, then the procedure returns the proof

              Πi
            ∀zφ(z)
   ∀E ------------------
        φ(r1,r2, ..., rk)
   ∃I ------------------
           ∃yψ(t,y)

If no such proof can be found, the procedure returns "FAIL". The similarities to Planner and
its successors, and also to the "logic programming language" PROLOG [Kowalski 1974],
should now be evident. In particular, the behaviour of both MicroPlanner, and of PROLOG
programs, can be closely matched by the machinery just described. The proofs {Πi}
correspond to consequent theorems of MicroPlanner, and to the Horn clauses of PROLOG.
The value returned by a successful execution of a PROLOG program corresponds to the
realization which may be extracted from a normal proof of an existential theorem.

As has been convincingly demonstrated by work with PROLOG, a person who knows in
general terms how backward-chaining works is in practice able to express an arbitrary
algorithm as a set of implicational formulas; the execution of the algorithm takes place when a
backward-chaining theorem prover (eg, the PROLOG interpreter) is given those formulas as
axioms, and a goal which encodes the input to the computation. (One also needs a
mechanism for extracting an output value from a proof; in PROLOG, this output is
constructed in the course of the search for the proof.) It is of course essential that sets of
implications be constructed with an algorithm explicitly in mind; a set of implicational
formulas which are chosen solely according to the criteria of Tarskian truth and completeness
is exceedingly unlikely to be of any computational use, regardless of the theorem prover
used. (This is analogous to the observation that a proof of an ∀∃ theorem which is
constructed solely according to "mathematical" criteria such as validity and elegance is unlikely
to be of much computational use when executed by normalization.) Kowalski [1974] has
discussed the advantages of describing algorithms by sets of formulas and executing them by
use of a backward-chaining theorem prover. As we have shown, it is possible to mix
normalization with backward chaining; presumably, this should allow the benefits of the two
forms of computation to be realized simultaneously.

We  remark  on  two  additional  aspects  of  automatic  proof  construction  using  normalization: 

(1) Note that what Stallman and Sussman [1977] have called dependency directed
backtracking "comes for free" in normalization with pruning. Suppose that one wishes to
normalize a proof

              [A]    [B]
       Π₁     Π₂     Π₃
      A∨B      C      C
   ∨E --------------------
               C

whose main inference is ∨-elimination. Suppose further that C is a Harrop formula, and that
normalization of Π₁ does not decide between A and B. (Evidently, the requirement that only
non-Harrop axioms appear in proofs may be dropped in the case where the endformula is a
Harrop formula; consequently the normal form of Π₁ may fail to decide between A and B -
for example, Π₁ might consist simply of the axiom, "A∨B".) Then, in the usual case, it will
be necessary to normalize both Π₂ and Π₃. However, if the normal form Π₂' of Π₂ does not
make use of the assumption A, pruning allows us to produce Π₂' as the end result, and
thereby to dispense with the treatment of Π₃. Thus dependency information can be used to
reduce the amount of search or "backtracking", just as it does in the various systems for
"dependency based reasoning" which have been developed by workers in Artificial
Intelligence (see London [1978], Doyle [1978], Shrobe [1979]). Also note that in normalization,
dependency directed backtracking does not rely on "non-hierarchical contexts" or
non-monotonic inferences.
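
The saving described in (1) can be made concrete with a small sketch. The encoding is an assumption made for illustration only: proofs are nested tuples, an assumption is a leaf ("var", name), a single name `alpha` labels the assumption discharged by the ∨-elimination, and `normalize` stands for whatever normalization procedure is in use. The function names are hypothetical.

```python
def uses(term, name):
    """True if the assumption ("var", name) occurs anywhere in term."""
    if term == ("var", name):
        return True
    return isinstance(term, tuple) and any(uses(s, name) for s in term)

def normalize_cases(alpha, major_nf, left, right, normalize):
    """Normalize a case analysis whose major premise is undecided.

    major_nf is the normal form of the proof of A-or-B, which is assumed not
    to decide between the cases. The left branch is normalized first; if its
    normal form makes no use of the discharged assumption, it is returned
    directly and the right branch is never normalized.
    """
    left_nf = normalize(left)
    if not uses(left_nf, alpha):
        return left_nf                      # pruning: no backtracking needed
    return ("OE", alpha, major_nf, left_nf, normalize(right))
```

The point of the sketch is that the test `uses(left_nf, alpha)` is a purely syntactic inspection of the normalized branch; nothing beyond the dependency data already present in the proof is consulted.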

(2) A complete proof of a formula ∀x∃yφ(x,y) provides evidence for the truth of
∀x∃yφ(x,y), and in addition describes a method for computing a function f with ∀xφ(x,f(x)).
As a consequence, the normalization of an incomplete proof Π:∀x∃yφ(x,y) constitutes both a
search for evidence, and a search for an algorithm with certain properties; in the terminology
of computer science, normalization can serve as a method for the synthesis of complete
programs from program fragments. (For comparison with program synthesis for PROLOG
see [Clark and Sickel, 1977]). Notes: (a) If normalization is implemented as a semi-automatic
procedure - a procedure in which a human user has the option of interactively constructing
proofs of lemmas - then we arrive at a "refinement" method for constructing programs very
much like that developed by Bates [1979]. (b) A single proof transformation, namely pruning,
can have the effect of improving the efficiency of a computation at "run-time" (as explained
in the last paragraph), or of optimizing an algorithm, depending on whether the
transformation is applied in the course of computing a value, or to a proof of an ∀∃ formula.

It is also worth considering the case where normalization of Π:∀x∃yφ(x,y) produces a proof
Π' which is not complete. Here, Π' may still be used to compute values of y with φ(x,y)
from values of x in the manner described earlier; the computation will not consist of "pure"
normalization, but will involve backward-chaining through lemmas as well. What, then, is the
significance of the passage from Π to Π'? In the scheme for the execution of incomplete
proofs with which we are currently concerned, the burden of computation is shared between
automatic deduction (perhaps in the form of "matching"), and pure normalization. When
Π:∀x∃yφ(x,y) is executed, all computation (including both pure normalization and automatic
deduction) which is possible in the absence of a concrete value for x is carried out. When a
concrete value for x is supplied, the remainder of the computation is performed. Thus the
passage from Π to Π' constitutes a kind of optimization; all work which can be done without
knowing the value of x is carried out first, and, as a consequence, this work does not have to
be repeated each time Π' is run.
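
The division of labour between the passage from Π to Π' and the later runs of Π' resembles ordinary staging of a computation. The following sketch shows only that analogy; `specialize`, `prepare` and `finish` are hypothetical names, and nothing here models normalization itself.

```python
def specialize(prepare, finish):
    """Stage a computation in the way the passage from Π to Π' does.

    prepare() performs all work that is possible without a value for x;
    finish(prepared, x) performs the remainder once x is supplied.
    The returned function plays the role of the residual proof Π'.
    """
    prepared = prepare()           # done once, before any x is known

    def residual(x):
        return finish(prepared, x) # done on every run

    return residual
```

If `prepare` builds a table and `finish` merely looks a value up in it, the table is built once however many times the residual function is applied.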


5.2  Analysis  of  change 


Consider a situation in which one is obliged to solve a series of problems P₁, P₂, ..., Pₙ,
where Pᵢ₊₁ is only "slightly different" from Pᵢ. Then it may happen that the same solution
works for many consecutive problems. It is useful in this situation to determine conditions
under which a small change in a problem leaves the correctness of a solution intact; if the
difficulty of evaluating such conditions is small compared to the effort involved in
constructing a new solution, then the total effort needed for solving P₁, P₂, ..., Pₙ can be
reduced.

In Artificial Intelligence, the task of determining the effects of small changes is referred to
as the "frame problem" [McCarthy 1969]. The use of proofs as descriptions of algorithms can
provide aid in attacking the frame problem, in the following way. Suppose that when a
problem P is solved, one constructs not only a solution S, but also a proof Π that S really is a
solution of P. Then Π provides an explicit description of the features of the problem upon
which the success of S depends. If P is changed slightly, one is able to see, by inspecting the
proof Π, whether any feature relevant to the success of S has been modified. Now, if one
uses a proof to describe a method for solving a problem, then the execution (i.e. normalization)
of the proof when applied to a particular problem yields not only a solution, but also a
specialized proof that the solution is correct; and, as we have said, this proof can be used in
the analysis of change.

This idea is illustrated by the following schematic example. Consider the problem of
computing an output value v with φ(t,v) when given a vector t = t₁, ..., tₙ of inputs.
Suppose that an algorithm for doing the computation is given by a proof Π of ∀x∃yφ(x,y)
and that the result of executing Π(t) is a proof Π' of ∃yφ(t,y) which provides v as a value for
y. In the general case, Π' will make use of properties of some but not all of the inputs
t₁, ..., tₙ. Suppose then that a "slightly different problem" is presented - namely the
problem of computing v' with φ(t',v'), where the vector t' differs from t in only a few entries.
If the entries in which t' differs from t do not include any of the entries whose properties are
mentioned by Π', then φ(t',v) holds, and the computation does not need to be repeated.
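
The reuse test in this schematic example can be sketched as follows. The particular problem is an assumption chosen for illustration: φ(t,v) says that v is the least index with t[v] > 0, so that the correctness of v mentions only the entries t[0], ..., t[v]. The set of mentioned entries is returned alongside the value, standing in for the dependency data carried by the normal proof Π'. All names are hypothetical.

```python
def first_positive(t):
    """Return (v, used): v is the least index with t[v] > 0, and used is the
    set of entries of t on which the correctness of v depends."""
    for i, value in enumerate(t):
        if value > 0:
            return i, set(range(i + 1))
    raise ValueError("no positive entry")

def solve_series(problems):
    """Solve a series of slightly different problems, reusing the previous
    solution whenever none of the entries it depends on has changed."""
    results, cached, recomputed = [], None, 0
    for t in problems:
        if cached is not None:
            old_t, v, used = cached
            if len(t) == len(old_t) and all(t[i] == old_t[i] for i in used):
                results.append(v)          # solution still correct: reuse
                continue
        v, used = first_positive(t)
        recomputed += 1
        cached = (t, v, used)
        results.append(v)
    return results, recomputed
```

On the series [0,3,5,7], [0,3,9,9], [0,0,9,9] the second problem changes only entries outside the dependency set {0,1}, so its solution is reused; the third changes entry 1 and forces a recomputation.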

The same kind of analysis of change can be carried out without using proofs. Suppose
that, in the above schematic example, the computation of v from t is carried out by the
execution of an ordinary program p(x₁, ..., xₙ) rather than by the normalization of a proof.
Then a trace of the execution of p(t₁, ..., tₙ) will indicate which among the values t₁, ..., tₙ
have been used in the computation and which have not, thus providing the same kind of
dependency data as is supplied by the normal proof Π':∃yφ(t,y). However, the normal
proof Π' in general provides a more thorough and more useful analysis of dependencies than
the corresponding program trace. To see how this can come about, compare the execution of
a conditional expression


      if r₁ then r₂ else r₃

with the normalization of the corresponding proof:

              [A]    [B]
       Π₁     Π₂     Π₃
      A∨B      C      C
   ∨E --------------------
               C

Suppose that (1) r₁ evaluates to "TRUE", (2) the normal form of Π₂ does not contain the
assumption A, and (3) an input tⱼ appears in r₁ (and in Π₁) but not in r₂ (nor in Π₂). Then,
in a trace of the execution of "if r₁ then r₂ else r₃", the outcome will appear to depend on tⱼ,
but the corresponding normal proof will reveal that the correctness of the outcome is
independent of tⱼ.
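
The contrast between the two kinds of dependency data can be sketched under simple assumptions: inputs are read through a recording function, and whether the normal form of the "then" branch uses the assumption A is supplied as a flag rather than computed. The names `run_conditional` and `proof_dependencies` are hypothetical.

```python
def run_conditional(inputs, cond, then_branch, else_branch):
    """Execute "if cond then then_branch else else_branch" and return the
    value together with the set of inputs read during execution (the trace)."""
    trace = set()

    def read(name):
        trace.add(name)
        return inputs[name]

    value = then_branch(read) if cond(read) else else_branch(read)
    return value, trace

def proof_dependencies(inputs, then_branch, then_uses_assumption, trace):
    """Dependencies of the normal proof when the test has come out TRUE.

    If the "then" branch uses the assumption established by the test, pruning
    does not apply and the dependencies coincide with the trace. Otherwise
    only the inputs mentioned by the branch itself remain.
    """
    if then_uses_assumption:
        return set(trace)
    branch_reads = set()

    def read(name):
        branch_reads.add(name)
        return inputs[name]

    then_branch(read)
    return branch_reads
```

With a test that reads t1 and a "then" branch that reads only t2, the trace reports {t1, t2} while the pruned proof reports {t2}, which is exactly the situation described in conditions (1) to (3).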

Thus,  in  the  analysis  of  change  as  in  the  specialization  of  algorithms,  proofs  provide 
additional  data  about  the  dependencies  between  facts  involved  in  a  computation,  and  this 
additional  data  can  be  exploited  to  avoid  redundant  computation. 

Analysis of change of the kind which we have been discussing - based however on the
use of programs, and not proofs, as descriptions of algorithms - has been used in a number of
settings within computer science. To take a simple example, the conventional program
optimization which is known as code motion [Aho and Ullman 1973] involves analysis of
change in the context of iterative computation. In the typical kind of code motion, an
assignment statement "v ← t" is moved out of an inner loop when it is determined that the
variables appearing in t do not change in the loop. By using a proof for describing the
computation of the value to be assigned to v, this analysis of change might be improved -
specifically by determination of the conditions under which the correctness of the value
computed depends on variables which change in the loop. A related idea is worked out in a
paper of Katz [1978] concerning the use of proofs of invariant assertions in optimizing iterative
descriptions of computation.

Other examples are provided by constraint systems such as those developed by Stallman
and Sussman [1977], and Horning [1979], and by "dependency based reasoning systems" such
as those of Shrobe [1979], London [1978] and Doyle [1978]. In these systems, situations - such as
the state of an electrical circuit [Stallman and Sussman 1977] - are represented in such a way
that the dependencies among the facts and values which describe the situation are explicitly
recorded. When the situation changes, or when an assumption about the situation is added or
withdrawn in the course of automatic deduction, the dependency information is used to
determine what aspects of the situation have been affected, and what computation has to be
done to update the representation. For the reasons given above, the use of proofs as
descriptions of algorithms may be expected to improve the analysis of dependencies upon
which these systems rely.



Appendix  A 


Comparison  to  Extraction  Methods  from  Proof  Theory 

Traditional proof theory provides two kinds of methods for the execution of proofs. First
there are the methods which operate by transformation of the proofs themselves. The
normalization procedure of Prawitz [1965] described in chapter 2 belongs to this class, as does
the cut-elimination procedure of Gentzen [1969] for the calculus of sequents. Second, there
are methods which involve extracting "programs" of one kind or another from proofs; it is
then the program which is executed, and not the proof itself. Examples of methods of the
latter kind are the recursive realizability interpretation of Kleene [1945], the Dialectica
interpretation of Gödel [1958], and the modified realizability interpretation of Kreisel [1959] for
analysis.

The  normalization  method,  and  the  modifications  to  it  which  we  have  made  in  order  to 
increase  efficiency,  have  of  course  been  discussed  at  length  in  this  thesis.  The  purpose  of  this 
appendix  is  to  compare  the  methods  which  we  use  to  the  other  family  of  execution  methods 
from  proof  theory  -  namely,  the  functional  and  realizability  interpretations.  The  account 
which  follows  is  intended  for  the  reader  who  is  familiar  with  these  interpretations. 

In general terms, the situation is this. The "programs" extracted by the three
interpretations mentioned above are Gödel numbers of partial recursive functions in the case
of recursive realizability, and typed λ-calculus terms in the other two cases. As shown by
Mints [1977], the various programs extracted by these interpretations from a proof of
∀x∃yφ(x,y) all compute the same function as does normalization. Furthermore, the
convertibility results of Mints, and the commutativity results of Diller [1979] show that it is
not only the function computed which remains fixed under these interpretations, but also the
form of the computation sequences which arise when the function is applied to a particular
argument.

The programs extracted by the functional and realizability interpretations mentioned
above resemble the untyped p-terms which we extract from proofs in that both the p-terms
and the programs contain the information in a proof which is relevant to execution but leave
out most of the rest of the data in the proof. The interpretations differ among themselves in
the efficiency of the programs which they extract, but, in one case - namely modified
realizability - the extracted programs are as concise and computationally efficient as p-terms.
The Dialectica interpretation also produces "good code", but to a somewhat lesser extent. In
the case of recursive realizability, efficiency depends on the particular Gödel numbering and
interpreter used.


For our purposes the differences between p-terms and programs are crucial, since
p-terms contain the dependency data needed for pruning, whereas the programs do not. In
order to specialize algorithms by symbolic execution and pruning as we have done in the
bin-packing experiments of chapter 4, we need a form of computational description which meets
both of the following requirements: (1) Symbolic execution of the description must be
tolerably efficient. (2) The dependency data needed by pruning must be present in the
description, and further, this dependency data must be preserved in the course of symbolic
execution. Now, normalization as described by Prawitz [1965] meets the second requirement
but not the first whereas, from what we have just said, the programs extracted by the
functional and realizability interpretations meet the first requirement but not the second.
Thus none of the tools from traditional proof theory is adequate for performing the kinds of
manipulations on algorithms which have been the central concern of this thesis, and for this
reason it was necessary to use a new form of computational description - the p-term.

For a more explicit formulation of the relationship between p-terms and the programs
extracted by the interpretations, we will need the following notation. Let γ₁ be the procedure
which extracts untyped p-terms from proofs, and let γ₂ be the extraction procedure for any
one of the interpretations. The modified realizability and Dialectica interpretations extract
typed λ-calculus terms from proofs; however, it is convenient here to regard the terms
extracted by γ₂ as terms of the ordinary untyped λ-calculus. This is an inessential
modification, since the type information contained by λ-terms is not needed for normalization
and cannot help in pruning. With this taken into account, there is a procedure γ₃ for
extracting programs from p-terms such that the diagram,

                 γ₁
      proofs ----------> p-terms
           \                |
         γ₂ \               | γ₃
             \              v
              '-------> programs

commutes; that is, γ₂(Π) = γ₃(γ₁(Π)) for every proof Π.

Thus, p-terms lie "on the way" from proofs to programs. Furthermore, the map γ₃ is
many-to-one: there is no way of getting the p-term back from the program extracted from it.

In part (α) of section A.1, we will describe γ₃ in general terms for the modified
realizability interpretation, and show in part (β) that pruning cannot be used in connection
with programs produced by this interpretation. The treatment of recursive realizability is
essentially identical. In section A.2, the Dialectica interpretation is discussed. The example
which shows that pruning does not apply for modified realizability interpretation or for
recursive realizability also works for the Dialectica interpretation.

For the current purposes, it is convenient to restrict our attention to a theory which,
roughly speaking, represents the intersection of the theories treated by the various
interpretations - namely, the formulation of arithmetic given in section 3.6. The language of
this theory is just the standard language of arithmetic; the set of available lemmas consists of
the induction schema INDᵩ. We have not said what axioms are used, and we don't need
to, since the choice of axioms makes no difference to what we have to say. (Note that the
standard system for intuitionistic arithmetic arises from one such choice of axioms.)


A.1  Modified realizability and recursive realizability

(α) First we describe the map γ₃ which takes a p-term and rewrites it as a modified
realization. What γ₃ does is to replace the special operators OI₁, OI₂, OE, EI, EE of the
p-calculus by constructs of the ordinary λ-calculus. Specifically, we use the replacements:

      #                 ⇒  c   (where c is a constant symbol; a different constant
                                symbol is assigned to each occurrence of #)

      OI₁(t)            ⇒  <0,t>

      OI₂(t)            ⇒  <1,t>

      OE(α,t₁,t₂,t₃)    ⇒  if π₁(t₁) then t₂[α←π₂(t₁)] else t₃[α←π₂(t₁)]

      EI(t₁,t₂)         ⇒  <t₁,t₂>

      EE(x,α,t₁,t₂)     ⇒  t₂[x←π₁(t₁), α←π₂(t₁)]

INIXx.l)  =>  {R(W,(t).  Ay  /.({*2(lXy)}«0./>)))}  (x) 

In the above, R is a conventional recursion operator, to which the reduction rules

      R(t₁,t₂)(succ x)  ⇒  t₂(R(t₁,t₂)(x))(x)

      R(t₁,t₂)(0)       ⇒  t₁

apply.

The conditional operator "if t₁ then t₂ else t₃" is assumed to take the numeral 0 as
TRUE, and the numeral 1 as FALSE. The conditional operator can of course be defined by
"if t₁ then t₂ else t₃ = R(t₂, λx y. t₃)(t₁)". Pairing can also be defined in the typed λ-calculus
over arithmetic, but since the types of terms are not available in the current context, we take
pairing and projection as primitive. (There is no definable operator in the untyped λ-calculus
which has the characteristics of a pairing operator, as shown by Barendregt [1972].)
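
The two reduction rules for R, and the definition of the conditional from R, can be checked with a short sketch. It is an illustration under stated assumptions: numerals are Python integers, terms are Python values and curried functions, the conditional is read as R(t₂, λx y. t₃) applied to t₁, and both branches are evaluated eagerly, which a reduction-based interpreter need not do.

```python
def R(t1, t2):
    """Recursion operator satisfying
         R(t1, t2)(0)     = t1
         R(t1, t2)(n + 1) = t2(R(t1, t2)(n))(n)
    computed iteratively from the base case upwards."""
    def rec(n):
        result = t1
        for k in range(n):
            result = t2(result)(k)
        return result
    return rec

def if_then_else(t1, t2, t3):
    """Conditional defined from R, with numeral 0 as TRUE and 1 as FALSE."""
    return R(t2, lambda x: lambda y: t3)(t1)
```

The step function λx y. t₃ ignores both the recursive result and the predecessor, so any nonzero numeral selects t₃ while 0 selects t₂.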

These replacements preserve the behavior of terms under normalization, as shown by the
propositions (1) - (4) below.

(1) If t₁ reduces to t₂ by the application of a single reduction rule of the p-calculus, then
γ₃(t₁) reduces to γ₃(t₂) by the application of a single reduction rule of the λ-calculus.

(2) If t is in normal form (for the p-calculus), then γ₃(t) is also in normal form (for the
λ-calculus).

(3) Let t be a p-term which has been extracted from a proof (of arithmetic). Then t and
γ₃(t) both have the uniqueness property. By (1), (2) immediately above, we have |γ₃(t)| =
γ₃(|t|), where |t| designates the normal form of t.

(4) Let t be a p-term which has been extracted from a proof in arithmetic all of whose
axioms are true. Then there is an assignment of types to the variables and constants of γ₃(t)
such that the resulting typed λ-calculus term realizes the endformula of the proof in the sense
of modified realizability.

Thus,  from  the  point  of  view  of  execution  as  opposed  to  pruning,  there  is  not  much 
difference  between  the  term  extracted  from  a  proof  for  modified  realizability,  and  the  p-term 
which  we  extract  from  proofs. 

(β) However, γ₃ destroys the dependency information which is needed by pruning. The
problem is that in replacing "OE(α,t₁,t₂,t₃)" by "if π₁(t₁) then t₂[α←π₂(t₁)] else t₃[α←π₂(t₁)]",
one loses track of the use, if any, which is made in t₂ and t₃ of the assumption represented
by α.


In what follows, we demonstrate this point in a formal way by exhibiting a pair of proofs
Π and Π' such that (a) γ₃(γ₁(Π)) = γ₃(γ₁(Π')), and (b) pruning can be applied to Π, but not
to Π'. Thus the information which distinguishes between proofs which can be pruned and
those which cannot is lost by γ₃. A fortiori, the data needed to determine the outcome of
pruning operations is not present in the ordinary λ-terms produced by γ₃.

In order to construct Π, Π', we first need proofs Π₁:A∨B, Π₂:A'∨B, Π₃:ψ(0), Π₃':ψ(0),
Π₄:ψ(0), Π₅:ψ(1) such that (a) A and A' are distinct, (b) γ₁(Π₁) = γ₁(Π₂), (c) the set of open
assumptions of Π₃ is {A'}, (d) the set of open assumptions of Π₃' is {A,A'}, (e) γ₁(Π₃) =
γ₁(Π₃')[α←β] where α is the proof variable assigned to the assumption A, and β the
proof variable assigned to the assumption A', (f) the set of open assumptions of Π₄ is {B}, (g)
the set of open assumptions of Π₅ is {B}. It is not difficult to construct proofs with these
properties. For example, we can take A = φ∨(0=1), A' = φ∨(1=2), and (using the
q-notation explained in section 4.2),

      Π₁ = "OE(Π₀:φ∨B,
               OI(OI(AS(φ),0=1),B),
               OI(φ∨(0=1),AS(B)))".

      Π₂ = "OE(Π₀:φ∨B,
               OI(OI(AS(φ),1=2),B),
               OI(φ∨(1=2),AS(B)))".


where Π₀ is any proof of φ∨B. Then, as desired, γ₁(Π₁) = γ₁(Π₂) =
OE(α,γ₁(Π₀),OI₁(OI₁(α)),OI₂(α)). By the requirement (e) above, Π₃ and Π₃' must be
identical in form except that uses of the assumption A in Π₃' are replaced by uses of the
assumption A' in Π₃. This can be achieved by a trick similar to the one used for Π₁, Π₂
above.

Now, we take

Π =

                    Π₂      Π₃      Π₄
                   A'∨B    ψ(0)    ψ(0)
                ∨E ----------------------
                           ψ(0)                 Π₅
          Π₁            ∃I --------            ψ(1)
         A∨B              ∃xψ(x)           ∃I --------
                                              ∃xψ(x)
      ∨E -----------------------------------------------
                          ∃xψ(x)


Π' is just the same as Π, except that Π₃' appears in the place of Π₃. The p-term notation
for Π is:

      t = OE(α,t₁,EI(0,OE(β,t₁,t₃,t₄)),EI(1,t₅))

where t₁ = γ₁(Π₁) = γ₁(Π₂), t₃ = γ₁(Π₃), t₄ = γ₁(Π₄), and t₅ = γ₁(Π₅). The p-term notation
for Π' is

      t' = OE(α,t₁,EI(0,OE(β,t₁,t₃',t₄)),EI(1,t₅))

where t₃' = γ₁(Π₃'). The only difference between t and t' is that t₃ = t₃'[α←β]. As a
consequence, t can be pruned to "EI(0,OE(β,t₁,t₃,t₄))", whereas t' cannot. However, the
difference between t and t' is lost in the course of translation from the p-calculus into the
λ-calculus; in passing from t₃ to γ₃(t₃) and from t₃' to γ₃(t₃'), α and β are replaced by the same
term, namely π₂(γ₃(t₁)). Specifically,

      γ₃(t) = γ₃(t') =

          if π₁(γ₃(t₁)) then
              <0, if π₁(γ₃(t₁)) then γ₃(t₃)[β←π₂(γ₃(t₁))] else γ₃(t₄)[β←π₂(γ₃(t₁))]>
          else <1,γ₃(t₅)[α←π₂(γ₃(t₁))]>

As desired, this example demonstrates that pruning cannot be applied to the λ-terms
extracted by the modified realizability interpretation. The same example works in the same
way for the recursive realizability interpretation. The only difference is that "if then else" and
the pairing operator are regarded as operators not on λ-terms but on Gödel numbers of partial
recursive functions.
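
The loss of dependency information can be replayed mechanically. The sketch below is an illustration under stated assumptions rather than the implementation used in the thesis: p-terms are nested tuples, proof variables are leaves ("var", name), the subterms t₁, t₄, t₅ are opaque stand-ins, t₃ and t₃' are modelled as a generic node applied to β and to α respectively, and substitution ignores variable capture because all names are distinct. Only the clauses for OI₁, OI₂, OE and EI are covered, and the function names are hypothetical.

```python
def subst(term, name, replacement):
    """Replace every occurrence of the proof variable name in term."""
    if term == ("var", name):
        return replacement
    if isinstance(term, tuple):
        return tuple(subst(s, name, replacement) for s in term)
    return term

def uses(term, name):
    """True if the proof variable name occurs in term."""
    if term == ("var", name):
        return True
    return isinstance(term, tuple) and any(uses(s, name) for s in term)

def gamma3(t):
    """Translate a p-term into a lambda-term with pairs and conditionals."""
    tag = t[0]
    if tag == "OI1":
        return ("pair", 0, gamma3(t[1]))
    if tag == "OI2":
        return ("pair", 1, gamma3(t[1]))
    if tag == "EI":
        return ("pair", t[1], gamma3(t[2]))
    if tag == "OE":
        _, alpha, t1, t2, t3 = t
        major = gamma3(t1)
        witness = ("pi2", major)          # replaces the proof variable alpha
        return ("if", ("pi1", major),
                subst(gamma3(t2), alpha, witness),
                subst(gamma3(t3), alpha, witness))
    return t                              # variables and opaque leaves

def can_prune(t):
    """An OE node can be pruned to its left branch when that branch makes
    no use of the discharged proof variable."""
    _, alpha, _, left, _ = t
    return not uses(left, alpha)

# Stand-ins for the subterms of the example.
t1, t4, t5 = ("leaf", "t1"), ("leaf", "t4"), ("leaf", "t5")
t3 = ("use", ("var", "beta"))             # depends on A' only
t3_prime = ("use", ("var", "alpha"))      # depends on A

t = ("OE", "alpha", t1, ("EI", 0, ("OE", "beta", t1, t3, t4)), ("EI", 1, t5))
t_prime = ("OE", "alpha", t1,
           ("EI", 0, ("OE", "beta", t1, t3_prime, t4)), ("EI", 1, t5))
```

Under these assumptions `can_prune` separates t from t', while `gamma3` sends both to the same λ-term, since α and β are each replaced by the second projection of the translated major premise.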


A.2  The Dialectica interpretation

The Dialectica interpretation [Gödel 1958] extracts somewhat more information from a
proof than either the recursive realizability interpretation or the modified realizability
interpretation; consequently, it requires separate treatment. In part (α), we describe the map
γ₃ from p-terms to Dialectica realizations, and in part (β), we show that pruning is not
applicable to the terms extracted by the Dialectica interpretation.


(α) For each formula A of arithmetic, the predicate "f Dialectica interprets A" is
expressed by a formula ∀x D_A(f,x) in the theory of functionals of finite type over the natural
numbers, where D_A is quantifier free. (In the standard treatments of the Dialectica
interpretation, the single universal variable x in ∀x D_A(f,x) is replaced by a vector of variables
x. However, it is convenient for our purposes to use a single universal variable which may
range over pairs.)

The difference between the modified realizability interpretation and the Dialectica
interpretation may be summarized as follows. In the modified realizability interpretation, a
functional which realizes a formula A⊃B is required to produce a realization for B whenever
it is supplied with a realization of A. In the Dialectica interpretation, a realization for A⊃B
must provide not only a way of getting from realizations of A to realizations of B, but also,
roughly speaking, a way of getting from refutations of B to refutations of A. Specifically, a
Dialectica realization of A⊃B is a pair <X,Y> of functionals such that

      ∀f y.(D_A(f,Y(<f,y>)) ⊃ D_B(X(f),y))

holds. The functional X takes realizations of A onto realizations of B, just as the
corresponding functional for the modified realizability interpretation does. The role of the
functional Y is this. Suppose that f is proposed as a realization for A but that, in actuality, f
does not realize A. Also suppose that a functional y is given such that D_B(X(f),y) does not
hold. Then y constitutes a refutation of the proposition that X(f) is a realization of B. What
Y does is to take the refutation y, and the functional f, and produce a functional Y(<f,y>)
which constitutes a refutation of the proposition that f is a realization of A.

In the definition given below, it is convenient to write the realization predicate D_A in the
form "λf x.φ", where φ is a quantifier free formula; this allows us to explicitly indicate
occurrences of the variables f,x which represent the arguments to the predicate. The definition
of D_A by induction on the structure of A is as follows.

(1) Base case. For A atomic, D_A = λf x.A (where f and x are new variables not appearing
in A)

(2) D_A∧B = λf x.(D_A(π₁(f),π₁(x)) ∧ D_B(π₂(f),π₂(x)))

(3) D_A∨B = λf x.((π₁(f)=0 ⊃ D_A(π₂(f),π₁(x))) ∧ (π₁(f)≠0 ⊃ D_B(π₂(f),π₂(x))))

(4) D_∃y.A = λf x.(D_A(π₂(f),x)[y←π₁(f)])

(5) D_∀y.A = λf x.(D_A(f(π₁(x)),π₂(x))[y←π₁(x)])

(6) D_A⊃B = λf x.(D_A(π₁(x),π₂(f)(x)) ⊃ D_B(π₁(f)(π₁(x)),π₂(x)))

The map for the Dialectica interpretation yields not one but several λ-terms when
applied to a p-term t. Namely, it produces (1) a realization X, and, (2) a term Y_α for each
proof variable α which appears free in t. The term X is a Dialectica realization for the
endformula of the proof Π from which t was extracted, while for each α, the term Y_α
computes refutations (in the sense described earlier) of the open assumption of Π which
corresponds to the proof variable α. More precisely, if α₁, ..., αₙ are the proof variables for
assumptions A₁, ..., Aₙ in a proof Π with endformula C, then the formula

      ∀α₁ ... αₙ x.
        (D_A₁(α₁,Y_α₁(x)) ∧ D_A₂(α₂,Y_α₂(x)) ∧ ... ∧ D_Aₙ(αₙ,Y_αₙ(x))) ⊃ D_C(X,x)

holds in the theory of functionals of finite type for some assignment of types to the variables
and constants of the Y_α and of X. We will designate the realization X which is extracted
from a term t by "X(t)", and the refutation maps Y_α by "Y_α(t)". Some of the clauses of
the inductive definition of the extraction map γ₃ are as follows. The remaining clauses have a
similar character; the interested reader should have no trouble working them out for himself.

(1)  Base  ease 

X:  a  =>  «; 

Ya:  a  =>  Xx.x 


X:  ti  =>  c 


[where  c  is  a  new  constant  symbol) 


(2)  X:  OI(l,t) 


<0,X(t)> 


X:  01(2,t)  =>  <1.X(0> 

Ya:  01(1,0  =*  Xx.{(Ya(0Xwl(x»} 
Ya:  01(2,0  =>  Ax.{(Ya(0)(7r2(x))} 


(3)  XiO^a.tj.tj.tj))  =*  if  77 |(X(t|))  then  X(t2)[a-772(X(t1))]  else  X(t3)[«-772(X(t1))J 

Y^:OF(a,t1,t2,tj)  =>  if  tt^X^))  then  Xx.(Y/?(t1)(Ya(t2)(x»  else  Ax.(Y/S(t1)(Ya(t3Xx)) 

(if  appears  free  in  t3; 

Yo:OF(a,tl,t2,t3)  =>  Y p(t2)  (if  /?  appears  free  in  t2;  P*a) 

Yo:OH(a,t1,t2,t3)  =>  Yn(t3)  (if  /?  appears  free  in  t3;  P*a) 

(if  ft  appears  free  in  more  tlian  one  of  t)(  t2,  t3,  then  any  of  the  applicable  clauses 
for  Yp  may  be  used) 

(4)  X: λα.t ⇒ <λα.X(t), λα x.Yα(π2(x))>

     Yβ: λα.t ⇒ λx.(Yβ(t)(π2(x)))   (where β≠α)

(5)  X: EI(t1,t2) ⇒ <t1,X(t2)>

     Yα: EI(t1,t2) ⇒ Yα(t2)

Now, to show that pruning is not applicable to the terms extracted by the Dialectica
interpretation, we only need to verify that the p-terms t and t' of the last section yield the
same term when given to γ3. This is straightforward, since, as we have seen, the Dialectica
interpretation and the modified realizability interpretation behave in the same way so long as
implication ("⊃") is not involved. In particular, we have,

X(t) = X(t') =

    if π1(X(t1)) then

        <0, if π1(X(t2)) then X(t3)[β←π2(X(t2))] else X(t4)[β←π2(X(t2))]>

    else <1,X(t5)[α←π2(X(t1))]>


Appendix  B 


Content  and  Form  in  Proof  Manipulation  -  An  Example 

There is a sharp contrast between the uses which we have made of proof manipulation
methods and the aims for which those methods were originally devised. Namely,
normalization, and its predecessor, cut elimination, were developed as tools for use in the
mathematical analysis of proofs and provability, whereas we have used them here for the
execution and transformation of algorithms. With this shift in aims comes a change in the
features of proof systems which are significant. The purpose of this appendix is to illustrate
this change by means of an example. In particular, it will be shown in section B.1 that the
complexity of the theorems which can be expressed and proved in a formal system - if you
like, the "power" or "inferential content" of the system - is not correlated with the complexity
of the computations which its proofs can describe. Thus a central feature of proof systems
from the point of view of most of proof theory is demonstrably unrelated to the central
feature of proof systems for the purposes of computation. In section B.2, the analysis given in
section B.1 is extended to normalization with pruning. We begin with a brief discussion of
the aims for which proof transformations were developed.

Most work in proof theory has addressed itself to questions which are formulated in terms
of provability and which do not make direct reference to proofs themselves or to their
properties. Questions and results of this kind have a certain generality in that they are
independent of the details of how proofs are represented: the differences between the familiar
proof systems (such as natural deduction, the calculus of sequents, "Hilbert-style" systems, and
so forth) are immaterial from the standpoint of provability - anything that can be proved from
given axioms in one system can also be proved in the others. Examples of central notions in
proof theory which refer only to provability are (1) the consistency of a theory, (2) the relative
"power" of logical principles, and (3) the "proof theoretic strength" of a theory as measured
by its ordinal. Of course, the study of questions to do with provability often requires
investigation of the details of particular proof systems. Cut-elimination, the ancestor of
normalization, was developed as part of just such an investigation; namely the investigation
which led Gentzen to his consistency proof for arithmetic from the principle of (quantifier
free) induction on the ordinal ε0.

However, formal proofs can also be studied as mathematical objects whose properties are
of interest in their own right. For example, the strong normalization theorem for natural
deduction [Prawitz 1965] (see chapter 3), and the theorems of [Mints 1977] about the
relationships between the "programs" extracted by the various realizability interpretations (see
appendix A) are of interest primarily for the theory of proofs (as objects of mathematical
study), and not so much for the theory of provability. These results have the common effect
of showing that the notion of the computation described by a proof is relatively stable under
changes of technical formulation. This should be compared to the stability under change of
formulation of the notion of a computable function; a stability which constitutes the evidence
for Church's thesis.

Kreisel [1969] and Statman [1974] have emphasized that a shift in attention from the
theory of provability to the theory of proofs leads to a change in the selection of notions and
distinctions which are important. As we have said, this appendix is intended to make a
similar point in regard to another view of proofs - namely the view of proofs as computational
descriptions. The example to be given shortly illustrates the differences between the aims of
computation and the aims of the theory of provability. An example of the conflict between
the aim of constructing a smooth theory of proofs, and the aim of making effective
computational use of proofs, has already been seen. Namely, it is essential for the purposes
of the stability results mentioned in the last paragraph that attention be restricted to
transformations which preserve the extensional meaning of proofs. On the other hand, if one
wishes to maximize the computational efficiency of proofs by means of mechanical
transformations, then one must use transformations - such as pruning - which change
extensional behavior; this was shown by the examples given in chapter 4. Thus the
selection of transformations which make for a smooth theory is different from the selection
which is best for practical applications. This kind of conflict between the aims of theory and
practice is of course common. In the one case general results are what is wanted, and in the
other useful techniques - techniques which may or may not have interesting general
properties, but which can be profitably applied by the use of human judgement.


B.1 Normalization in successor arithmetic

We proceed now to the example. Let Ts be the theory which results from
restricting the formulation of arithmetic given in section 3.6 to the language which has
symbols for successor and predecessor as its only function symbols. Thus the formulas which
appear in proofs of Ts will contain (a) zero, (b) "predecessor" and "successor", and (c)
"equals" as their only constant, function, and relation symbols, respectively. The lemmas
which may appear in proofs of Ts are those of the scheme INDφ of induction, where φ is a
formula of the restricted language. From the point of view of the set of provable theorems,
Ts is equivalent - modulo a simple translation - to the usual formulation of successor
arithmetic. (Predecessor is included as a primitive function because it simplifies the recursive
proofs of the induction lemmas.) Thus what we have called the "inferential content" of Ts
is exceedingly small. Indeed, from the point of view of the theory of provability, Ts is wholly
trivial; it has an elementary decision method by quantifier elimination and a finitist
consistency proof. Nonetheless, the computational content of Ts, in the sense of the set of
functions which are computed by proofs of ∀∃ formulas, is just the same as that of full
arithmetic. This is shown by the following theorem.

Theorem: Let f be a function on the natural numbers which is definable in Gödel's
system T of primitive recursive functionals of finite type [Gödel 1958]. Then there is a proof
Πf in Ts of ∀x∃y(y = y) such that normalization of Πf computes the function f.

Proof: We will show how to reverse Kreisel's modified realizability interpretation; a map
Γ from terms of the typed λ-calculus to proofs of Ts will be described which has the property
that the modified realizability interpretation extracts t from Γ(t). The map makes use of the
correspondence between proofs and λ-terms which was explained at the beginning of chapter
3. The end-formula of the proof gotten from a functional f of type "0→0" will be
∃x(x = x) ⊃ ∃x(x = x); this proof can be easily transformed into a proof of ∀x∃y(y = y) which,
in the natural sense, computes the same function. The theorem follows since normalization
and modified realizability yield the same computations. (See appendix A.)

First of all, we define a map δ from types (of functionals) to formulas by induction on
the structure of types. If τ is a type, then δ(τ) will be the end-formula of proofs representing
functionals of type τ.

(1)  Base case: δ: 0 ⇒ ∃x(x = x)

(2)  δ: τ→ρ ⇒ δ(τ) ⊃ δ(ρ).
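
The two clauses for δ can be sketched as a short recursive function. This is only an illustration under assumptions of our own: the class names Base and Arrow and the rendering of formulas as strings are hypothetical conveniences, not notation from the text.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Base:
    """The ground type 0 (hypothetical encoding)."""


@dataclass(frozen=True)
class Arrow:
    """The functional type dom -> cod (hypothetical encoding)."""
    dom: "Type"
    cod: "Type"


Type = Union[Base, Arrow]


def delta(t: Type) -> str:
    """Return the end-formula assigned to the type t, rendered as a string."""
    if isinstance(t, Base):
        # Clause (1): the type 0 is sent to the trivial formula ∃x(x = x).
        return "∃x(x = x)"
    # Clause (2): an arrow type is sent to the corresponding implication.
    return f"({delta(t.dom)} ⊃ {delta(t.cod)})"


zero_to_zero = Arrow(Base(), Base())
print(delta(zero_to_zero))
```

Every formula in the image of delta is built from "∃x(x = x)" and "⊃" alone, which is the sense in which the end-formulas carry no mathematical content.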

The map Γ is defined by induction on the structure of terms of the typed λ-calculus. For
the base case we need to define Γ's behavior on variables and the constant zero. Let
{x0, x1, x2, x3, ...} be an enumeration of the variables of type τ. Then Γ assigns the proof

        [δ(τ) ∧ (i = i)]
    ∧E ------------------
              δ(τ)

to the variable xi. The second conjunct "i = i" (where i is a numeral) serves to label the
assumption "δ(τ) ∧ (i = i)" among all of the other assumptions "δ(τ) ∧ (j = j)" representing
variables xj of type τ. Next, Γ assigns the proof


-  .  .  .  -4 

i 

I 

! 

i 

0=0  I 

31 - 

l 

3x(x  =  x)  '  j 

to  the  constant  zero.  'Ihe  remaining  clauses  of  the  inductive  defintion  are  as  follows.  j 

(1)  succ(t) ⇒

                              succ(x) = succ(x)
               Γ(t)        ∃I -------------------
            ∃x(x = x)             ∃x(x = x)
        ∃E ----------------------------------
                      ∃x(x = x)

(2)  λxi.t ⇒

                     Γ(t)
                     δ(τ)
        ⊃I -------------------------
            (δ(τ) ∧ (i = i)) ⊃ δ(τ)        where τ is the type of the term t

(3)  t1(t2) ⇒

            Γ(t1)    Γ(t2)
        ⊃E ----------------
                 δ(ρ)                      where τ→ρ is the type of t1,
                                           and τ is the type of t2

(4)  R(t1,t2): The types of t1, t2 will have the forms τ and 0→(τ→τ), respectively. Let F
be the formula δ(τ). Then the end-formula of Γ(t1) is F, and the end-formula of Γ(t2) is
∃x(x = x) ⊃ (F ⊃ F). The proof Γ(R(t1,t2)) uses the induction principle applied to the formula
φ(x) = "(x = x) ∧ F". It will be convenient to use the simpler of the two formulations of
induction for φ given in section 3.4, namely, the recursive proof:

                                                               Pφ: ∀x(φ(x))
                                                           ∀E --------------                    Π2
                                                    [x ≠ 0]     φ(pred x)          ∀x(x ≠ 0 ∧ φ(pred x) ⊃ φ(x))
                                        Π1      ∧I ------------------------    ∀E ------------------------------
    ∀xy(x = y ∨ x ≠ y)       [x = 0]   φ(0)          x ≠ 0 ∧ φ(pred x)              x ≠ 0 ∧ φ(pred x) ⊃ φ(x)
∀E --------------------   SB ----------------   ⊃E -------------------------------------------------------------
       x = 0 ∨ x ≠ 0               φ(x)                                        φ(x)
∨E -------------------------------------------------------------------------------------------------------------
                                                       φ(x)
                                                  ∀I ------------
                                                        ∀xφ(x)


We take Π1 in the above to be:

                  Γ(t1)
        0 = 0       F
    ∧I ------------------
         (0 = 0) ∧ F

and Π2 is:

              [x ≠ 0 ∧ ((pred(x) = pred(x)) ∧ F)]         pred(x) = pred(x)
           ∧E -----------------------------------      ∃I -------------------            Γ(t2)
                    (pred(x) = pred(x)) ∧ F                    ∃x(x = x)         ∃x(x = x) ⊃ (F ⊃ F)
                 ∧E -----------------------            ⊃E -------------------------------------------
                               F                                         F ⊃ F
              ⊃E -----------------------------------------------------------------------------------
    x = x                                                F
∧I ----------------------------------------------------------
                          (x = x) ∧ F
⊃I ----------------------------------------------------------
       x ≠ 0 ∧ (pred(x) = pred(x) ∧ F) ⊃ ((x = x) ∧ F)
∀I ----------------------------------------------------------
     ∀x.(x ≠ 0 ∧ (pred(x) = pred(x) ∧ F) ⊃ ((x = x) ∧ F))

This completes the inductive definition of Γ. It is a routine matter to check that

    γ(Γ(t)) ~ t

where γ is the modified realizability interpretation as described in appendix A, and where "~"
represents interconvertibility in the λ-calculus. Thus for each function f of type 0→0 which
is definable in Gödel's system T, there is a proof Π of ∃x(x = x) ⊃ ∃x(x = x), such that, for all
numerals n, the result of normalizing

           n = n
    ∃I -------------                  Π
         ∃x(x = x)         ∃x(x = x) ⊃ ∃x(x = x)
⊃E ------------------------------------------------
                     ∃x(x = x)

has the form

          Π0
         m = m
    ∃I -------------
         ∃x(x = x)

where m is the numeral for f(n). In order to get a proof Π' of a formula of the form
∀x∃y(y = y) which computes f, simply take

Π' =

           x = x
    ∃I -------------                  Π
         ∃y(y = y)         ∃y(y = y) ⊃ ∃y(y = y)
⊃E ------------------------------------------------
                     ∃y(y = y)
              ∀I ------------------
                    ∀x∃y(y = y)

This completes the proof of the theorem.
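
The computation which the recursive proof Pφ describes in clause (4) can be read as ordinary primitive recursion. The following is a minimal sketch, assuming the usual reading of the recursor in which R(t1,t2) returns t1 at 0 and applies t2 to the predecessor and the previous value otherwise; the name recursor and the use of Python integers for numerals are our own assumptions.

```python
from typing import Callable, TypeVar

A = TypeVar("A")


def recursor(base: A, step: Callable[[int], Callable[[A], A]]) -> Callable[[int], A]:
    """Sketch of R(t1, t2): base plays the role of t1, step the role of t2."""

    def run(x: int) -> A:
        if x == 0:
            # Branch [x = 0] of the case analysis: use the proof Π1, i.e. t1.
            return base
        # Branch [x ≠ 0]: obtain φ(pred x) recursively, then apply Π2, i.e. t2,
        # to the predecessor and to the previous value.
        return step(x - 1)(run(x - 1))

    return run


# Example: the factorial function, defined with the recursor alone.
factorial = recursor(1, lambda k: lambda acc: (k + 1) * acc)
print(factorial(5))
```

The case distinction on x = 0 in run corresponds to the ∨-elimination in Pφ, and the recursive call corresponds to the use of Pφ inside its own definition.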

The theorem shows that the proofs of successor arithmetic, despite their limited inferential
content, are just as computationally expressive as those of full arithmetic. The general reason
for this is evident - namely, the behavior of normalization depends chiefly on the structure of
the applications of induction principles in a proof, and is insensitive to the mathematical
content of the formulas to which induction is applied; this is a sense in which normalization
depends on the form rather than the content of proofs.

As an alternative way of expressing the significance of the theorem, one might say that it
demonstrates that normalization is a very bad method for treatment of successor arithmetic
proofs. There are after all computation procedures for proofs in this theory which are more
efficient in the general case than normalization. For example, since all predicates definable
in successor arithmetic are decidable, one can take a proof of ∀x∃yφ(x,y) and an input n
and produce an output m with φ(n,m) by simple linear search: φ(n,0), φ(n,1), φ(n,2) ... are
tested in turn until a number with the desired property is found. In this case, the proof serves
only as a guarantee that the search process will terminate. Thus it may happen that the best
computational results in proof manipulation are gotten by making use not just of the form of
proofs in the way that normalization does, but also of the mathematical content of the
formulas which appear in proofs. (We have already seen an example of this; in chapter 4, the
mathematical content of the bin-packing proofs was used in the simplex transformations.)
Successor arithmetic is an extreme case, since one does quite well by ignoring the proof
altogether except in its role as evidence for the truth of its end-formula.
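
The linear search just described can be sketched as follows. This is an illustration only: the decidable predicate φ is passed in as a Python function (the text does not fix a decision procedure), and the name search_witness is hypothetical.

```python
from typing import Callable


def search_witness(phi: Callable[[int, int], bool], n: int) -> int:
    """Test phi(n, 0), phi(n, 1), phi(n, 2), ... and return the first m that passes.

    Termination is not checked here; it is guaranteed only by the existence
    of a proof of ∀x∃yφ(x,y), which this function never inspects.
    """
    m = 0
    while not phi(n, m):
        m += 1
    return m


# Example predicate, expressible with successor and equality: y = succ(succ(x)).
def two_more(x: int, y: int) -> bool:
    return y == x + 2


print(search_witness(two_more, 3))
```

Note that the search uses only the content of φ and makes no use of the structure of the proof.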


B.2 Pruning in successor arithmetic

In the last section we were concerned with normalization without pruning. The question
which we address in this section is: how does the addition of pruning to the set of reduction
rules used in normalization affect its behavior in the context of successor arithmetic?
Certainly, pruning can make a large difference in the computational efficiency of some proofs.
In particular, each application of induction in proofs produced by the map Γ of the last
section constitutes a redundancy of the kind that pruning removes; in order to verify this, the
reader need only note that pruning is directly applicable to the normal form of Pφ for any φ.
As a consequence, pruning in this case reduces the complexity of the functions computed by
proofs in a drastic way; the functions computed by pruned proofs are describable by use of
conditional expressions and the successor function alone.

However, it is possible to modify the proofs produced by Γ in such a way that pruning is
no longer of any use. It follows that pruning does not reduce the computational complexity
of successor arithmetic proofs in the general case. To start with, consider the clause

(1)  succ(t) ⇒

                              succ(x) = succ(x)
               Γ(t)        ∃I -------------------
            ∃x.(x = x)            ∃x.(x = x)
        ∃E ----------------------------------
                      ∃x.(x = x)


in the inductive definition of Γ. Now, since the assumption "x = x" is not used in the second
premise of the above proof, the pruning rule for ∃-elimination given in section 2.7 is
applicable. However, we may take Γ(succ(t)) to be

                           [x = x]    succ(x) = succ(x)
                        SB ------------------------------
                                 succ(x) = succ(x)
               Γ(t)           ∃I -------------------
            ∃x.(x = x)               ∃x.(x = x)
        ∃E -------------------------------------
                       ∃x.(x = x)

instead, and in this case pruning is not applicable. By the same kind of trickery, it is possible
to modify Pφ in such a way that pruning is no longer of any use. We will show how this is
done in a moment, but first we wish to draw some general conclusions about pruning.
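
The contrast between the two versions of the succ clause can be made concrete with a toy dependency check. This is a simplified stand-in, not the pruning rule of section 2.7 itself: proofs are represented by a hypothetical Proof record, assumptions are identified by their formula text alone, and pruning an ∃-elimination is taken to mean replacing it by its second premise when that premise never uses the discharged assumption.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Proof:
    """Toy proof tree: a rule name, its conclusion, and its premises."""
    rule: str
    formula: str
    premises: Tuple["Proof", ...] = ()


def uses(p: Proof, assumption: str) -> bool:
    """Does the assumption occur as a leaf anywhere in the proof p?"""
    if p.rule == "assume":
        return p.formula == assumption
    return any(uses(q, assumption) for q in p.premises)


def prune_exists_elim(p: Proof, assumption: str) -> Proof:
    """Drop an ∃E whose second premise ignores the discharged assumption."""
    if p.rule == "∃E" and not uses(p.premises[1], assumption):
        return p.premises[1]
    return p


gamma_t = Proof("given", "∃x(x = x)")            # stands in for Γ(t)
axiom = Proof("axiom", "succ(x) = succ(x)")

# First version of the clause: the assumption "x = x" is never used.
plain = Proof("∃E", "∃x(x = x)", (gamma_t, Proof("∃I", "∃x(x = x)", (axiom,))))

# Second version: the SB inference makes an inessential use of "x = x".
padded_minor = Proof(
    "∃I", "∃x(x = x)",
    (Proof("SB", "succ(x) = succ(x)", (Proof("assume", "x = x"), axiom)),),
)
padded = Proof("∃E", "∃x(x = x)", (gamma_t, padded_minor))

print(prune_exists_elim(plain, "x = x") is plain.premises[1])
print(prune_exists_elim(padded, "x = x") is padded)
```

Even in this toy form, the check is purely syntactic: it cannot tell that the use of "x = x" in the second version is inessential.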

The use which is made of the assumption "x = x" in the proof above is inessential.
Further, the fact that it is inessential would be immediately recognized by any person who
inspected the proof. (For that matter, any person would recognize with equal ease that
"∃x.(x = x) ⊃ ∃x.(x = x)" is a true formula, and consequently perceive the uselessness of the
elaborate proofs generated by Γ.) It follows that an analysis of dependencies which is routine
for a person may or may not lie within the powers of the formal pruning operations with
which we have been concerned. The pruning operations are very sensitive to the formal
details of the proofs to which they are applied; two proofs which appear to be essentially
identical to a person may nonetheless behave very differently under pruning. Nor is pruning
in any sense universal among formal operations for the removal of redundant parts of proofs.
One can invent a variety of mechanical transformations on proofs which remove redundancies
of one kind or another, but which are useful in circumstances where pruning fails. To take
just one example, consider the following operation on proofs of arithmetic:


           Π                              Π[x←0]
          ∃yφ                              ∃yφ
    ∀I ----------           ⇒        ∀I ----------
         ∀x∃yφ                            ∀x∃yφ


where the condition for the operation is that x not appear free in φ. This operation, which in
a certain weak sense is sensitive to the content of proofs, is effective in reducing the
computational complexity of the proofs produced by the new version of the map Γ which we
are currently constructing - a map which produces proofs to which pruning is not applicable.
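
At the level of the functions computed, the operation above amounts to discarding the input. A minimal sketch, assuming that the program extracted from Π is available as a Python function f; the name specialize_at_zero is hypothetical.

```python
from typing import Callable


def specialize_at_zero(f: Callable[[int], int]) -> Callable[[int], int]:
    """Program-level counterpart of replacing Π by Π[x←0].

    When x does not appear free in φ, any output acceptable for the input 0
    is acceptable for every input, so the argument may be ignored.
    """
    return lambda _x: f(0)


# Example: a function whose cost grows with its input.
def slow(x: int) -> int:
    total = 1
    for _ in range(x):
        total += 2
    return total


constant = specialize_at_zero(slow)
print(constant(1000) == slow(0))
```

The resulting function is constant, so its cost no longer depends on the input. It generally differs extensionally from f; this is acceptable only because φ places no constraint linking y to x.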

Now, in order to complete the definition of the new version of Γ, we need to modify the
proof Π2 which appears as part of the proof Pφ given in clause (4) of the definition of Γ.
The proof Π2 appears as part of the proof of the third premise of an ∨-elimination inference
whose first premise reads "x = 0 ∨ x ≠ 0". However, in the normal form of Pφ, no use is
made of the assumption "x ≠ 0" in the proof of the third premise. The reason for this is that
no use is made of "x ≠ 0" in establishing the formula "(x = x) ∧ F" in Π2. However, in the
following slightly modified version of Π2, "x ≠ 0" is used, and consequently pruning is no
longer applicable to Pφ.


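              [x ≠ 0 ∧ ((pred(x) = pred(x)) ∧ F)]         pred(x) = pred(x)
           ∧E -----------------------------------      ∃I -------------------            Γ(t2)
                    (pred(x) = pred(x)) ∧ F                    ∃x(x = x)         ∃x(x = x) ⊃ (F ⊃ F)
                 ∧E -----------------------            ⊃E -------------------------------------------
     Π3                        F                                         F ⊃ F
              ⊃E -----------------------------------------------------------------------------------
    x = x                                                F
∧I ----------------------------------------------------------
                          (x = x) ∧ F
⊃I ----------------------------------------------------------
       x ≠ 0 ∧ (pred(x) = pred(x) ∧ F) ⊃ ((x = x) ∧ F)
∀I ----------------------------------------------------------
     ∀x.(x ≠ 0 ∧ (pred(x) = pred(x) ∧ F) ⊃ ((x = x) ∧ F))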

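where Π3 is

              [x ≠ 0 ∧ ((pred(x) = pred(x)) ∧ F)]
           ∧E -----------------------------------
            Π4      (pred(x) = pred(x)) ∧ F
                 ∧E -----------------------
    Π4     x ≠ 0       pred(x) = pred(x)
        ∧I ---------------------------------
   x ≠ 0      x ≠ 0 ∧ (pred(x) = pred(x))          ∀xy(x ≠ 0 ∧ y ≠ 0 ∧ pred(x) = pred(y) ⊃ x = y)
∧I -----------------------------------------    ∀E ----------------------------------------------
      x ≠ 0 ∧ x ≠ 0 ∧ (pred(x) = pred(x))           x ≠ 0 ∧ x ≠ 0 ∧ (pred(x) = pred(x)) ⊃ x = x
⊃E ----------------------------------------------------------------------------------------------
                                               x = x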


and finally, where Π4 is

        [x ≠ 0 ∧ ((pred(x) = pred(x)) ∧ F)]
    ∧E -------------------------------------